Abstract

A  new  type  of  software  test,  called  mutation  analysis, 
is  introduced.  A  method  of  applying  mutation  analysis  is 
described,  and  the  design  of  several  existing  automated 
systems  for  applying  mutation  analysis  to  Fortran  and  Cobol 
programs  is  sketched.  These  systems  have  been  the  means  for 
preliminary  studies  of  the  efficiency  of  mutation  analysis 
and  of  the  relationship  between  mutation  and  other 
systematic testing techniques. The results of several experiments
to determine the effectiveness of mutation
analysis are described, and examples are presented to illustrate
the way in which the technique can be used to
detect a wide class of errors, including many previously
defined and studied in the literature. Finally, a number of
empirical studies are suggested, the results of which may
add confidence to the outcome of the mutation analysis of a
program.

Mutations are seldom spectacular. Those
mutants that are startlingly different
from their parents tend not to survive
long, either because the mutation renders
them unable to function normally,
or because they are rejected by those
who sired them.

Robert  Silverberg 


1.  INTRODUCTION 

A major goal of software engineering is to discover an
efficiently testable property of programs, say PROP, so that
for all programs P the following holds:

     if PROP(P), then P is correct.     (1)

By correct one usually means that for all possible input
values x,

     P*(x) = f(x),

where P*(x) is the function computed by program P and f is a
function which specifies the intended behavior of the
program.

Prominent  examples  of  such  properties  are  program 
verification  and  testing  for  correctness: 

Program  Verification  [Man] 

Let A and B be predicates so that A(x) is true when x
is in the domain of the function f and B(y) is true when

     y = f(x).

Then

     PROP(P) if and only if |- A {P} B

can be used to define the predicate PROP in proposition (1).

Testing for correctness [LMW]

Let D be a subset of all possible input to the program
P, and say that D is a reliable test data set if

     P*(x) = f(x) for all x in D implies P* = f.

Clearly,

     PROP(P) if and only if P*(D) = f(D) for some reliable D

can also be used as a property for (1).

A major stumbling block in such systematizations as these
has been that the conclusion of proposition (1) is so strong
that, except for trivial classes of programs P, PROP(P) is
bound to be formally undecidable [How1]. Given this state
of affairs, program verification has turned to techniques
which do not require universal applicability. It has not
been clear what the corresponding course should be for
program testing, however. There is an undeniable tendency
among practitioners to relegate testing to completely ad-hoc
techniques: one creates tests that seem to capture the essence
of the program, observes the execution of the program
on those tests and draws a conclusion about the correctness
of the program based on the results of the observations.
This strategy seems to be too undisciplined [DLS1]. More
systematic techniques attempt to augment a programmer's
intuition by yielding quantitative information about the
degree to which a program has been tested (see [Good] for a
current survey) -- such coverage measures attempt to give
the tester an inductive measure of confidence that PROP(P)
has been determined. We will discuss several of these
methods rather more fully in the sequel.

The reader should note that these techniques generally
rely in one way or another on proposition (1) -- they attempt
by inductive or deductive means to allow a tester to
conclude  correctness.  But  correctness  is  a  very  strong 
property,  comprehending  for  instance  mathematical  equality 
of  infinite  functions.  It  is  rather  unlikely  that  efficient 
means  can  be  found  to  make  such  powerful  inferences. 

There  is  another  path  to  take,  however.  It  is  not  so 
well  travelled  because  it  is  less  scenic.  We  propose  to 
weaken  considerably  the  conclusion  of  (1),  to  replace  it  by: 

i.  P  is  correct 
or 

ii.  P  is  "pathological", 

where "pathological" will have a well-defined meaning, which
roughly corresponds to P possessing an empirically
determined characteristic which places it outside the range
of programs which can be treated in this way. The testing
technique determined in this way, we call mutation analysis.

In  carrying  out  this  plan  we  will  of  course  have  to 
sacrifice  some  of  the  elegance  of  the  techniques  based  on 
instances  of  (1),  but  we  hope  that  this  defect  is  balanced 
by  the  efficacy  of  mutation  analysis. 

The sequel is organized as follows. We first present
the basis of mutation analysis, relying as much as possible
on observable assumptions about the programming process. We
then describe the systems which have been constructed for
conducting mutation analysis of Fortran and Cobol programs.
We present examples typical of our experience with these
systems by means of several "experiments". Included among
these experiments will be some evidence for believing that
mutation analysis is useful in detecting a wide variety of
errors (via the coupling effect introduced in [DLS1]). In
Section 6, a case study is presented of the use of mutation
analysis to detect errors in a production system program; it
is shown in this study how test data can be strengthened to
locate and remove subtle errors. Section 7 discusses the
relationship of program mutation to error seeding and logic
circuit fault detection. A step in the mutation analysis
process involves the detection of certain kinds of program
equivalence; Section 8 contains a complete discussion of
this equivalence problem, suggesting some efficient algorithms
for automatically detecting the appropriate
equivalences. The paper closes with three nonobvious applications
of the technique to issues of concern in software
engineering.


2.  MUTANTS  OF  A  PROGRAM 

In [DLS1], we introduced data produced by Youngs [You]
that strongly hinted that the errors that are most likely to
be made in the programming process are simple, classifiable
errors. We have been led to attempt the following
generalization, which is used so frequently in our work that
we have given it a name:

The Competent Programmer Assumption
A COMPETENT PROGRAMMER, AFTER COMPLETING
THE ITERATIVE PROGRAMMING PROCESS AND
DEEMING THAT HIS JOB OF DESIGNING,
CODING AND TESTING IS COMPLETE, HAS
WRITTEN A PROGRAM THAT IS EITHER CORRECT
OR IS ALMOST CORRECT IN THAT IT DIFFERS
FROM A CORRECT PROGRAM IN "SIMPLE" WAYS.

Precisely what is meant by "simple" will occupy a
considerable amount of space in this paper, but the
intuitive content of the competent programmer assumption is
simply that competent programmers do not write programs at
random; if the program produced is not correct, it is a
program with bugs and can be edited into correct form by
finding and fixing the bugs. Suppose that the task at hand
is to design a Fortran program to compute the (Euclidean)
magnitude of an N-dimensional vector X in a Cartesian coordinate
system with fixed origin. Then the subroutine P1
certainly could have been produced by a competent programmer.

      SUBROUTINE P1(X,MAG)
      MAG = 1
      DO 1 I = 1,N
      MAG = MAG+X(I)**2
    1 MAG = SQRT(MAG)
      RETURN
      END

We  would  question  the  competence  of  a  programmer  who 
produced  subroutine  P2: 

      SUBROUTINE P2(X,MAG)
      MAG = X(1)
      DO 1 I = 1,N
    1 MAG = MAX(X(I),MAG)
      RETURN
      END

There is no reasonable sense in which P2 is a "buggy" version
of the program asked for. P1 can easily be debugged,
but P2 is not even a program of the same kind -- it is so
radically incorrect that its incorrectness should be
discovered by other means.

Suppose  that  we  now  try  to  inject  this  assumption  into 
proposition  (1)  and  try  to  discover  a  property  PROP  so  that: 

     if P is written by a competent programmer
     and PROP(P) then P is correct.     (2)

This is a considerable change. Proposition (1) in its
original form treats a program as a random object. Proposition
(2) on the other hand attempts to exploit something
special about the programming process (e.g., that a data
processing manager expects in response to the specifications
for a personnel system, something like a personnel system;
perhaps incorrect, inefficient or sloppy, but more like a
personnel system than, say, a missile guidance system).

To  be  more  specific:  we  are  after  a  testing  method 
that  addresses  the  following  version  of  correctness  testing. 

     Given a program P written by a competent
     programmer, find a test data set for
     which P works correctly by which we can
     infer that P is, with high probability,
     correct.

Test data which meets this criterion, we call adequate test
data. Under the competent programmer assumption it is easy
to derive some simple properties that adequate test data
should have. We can observe a community of programmers and
in principle classify the errors they tend to make into
k categories. We are free to observe the programmers for as long as we
wish and make whatever specialized assumptions we wish about
the programming task they will be called upon to perform.
Therefore it is in principle possible to gain whatever
degree of confidence we desire that among the k classifications
we have countenanced the errors most likely to
be made by this particular community. Given a program P to
test in this setting, we must derive an adequate set of test
data, D, for P. If P is incorrect, we will never be able to
find an adequate set; indeed, the point of testing P is to
find a set of test data that calls attention to the fact
that P is incorrect. If P is correct, however, an adequate D
should at least convince us that P does not contain the
errors most likely to be made.

Let

     P1, P2, ..., Pm

differ from P only in each containing a single error chosen
from one of the error categories. Then an adequate set of
test data D should at least provide the following
assurance. For each Pj which is not equivalent to P,

     P*(D) ≠ Pj*(D).

In other words for each of the most likely errors, it should
be possible to show that P does not contain that specific
error.

Each of the Pi's is said to be a mutant of the program
P. The competent programmer assumption states that a
program is assumed to be either correct or a mutant of a
correct program. For example, in the problem of computing
magnitudes of N-vectors, subroutine P1 is a mutant of the
correct P below.

      SUBROUTINE P(X,MAG)
      MAG = 0.0
      DO 1 I = 1,N
    1 MAG = MAG+X(I)**2
      MAG = SQRT(MAG)
      RETURN
      END

Subroutine P2, on the other hand, is not a mutant of P.

Mutation analysis is a method of eliminating the alternatives
-- developing a set of test data on which P works
correctly but on which all mutants of P fail (or in our suggestive
terminology, "die"). Without the competent programmer
assumption, there would be infinitely many mutants to
consider, but even with the assumption, practice may dictate
so many error types that this method is intractable. In
fact, one's first reaction upon hearing of the notion is to
dismiss it as an obviously intractable and therefore
ridiculous idea. But by concentrating only on "simple"
mutants of P the technique becomes manageable. For example,
P1 is not a simple mutant of P, but M1 and M2 are:




      SUBROUTINE M1(X,MAG)
      MAG = 1
      DO 1 I = 1,N
    1 MAG = MAG+X(I)**2
      MAG = SQRT(MAG)
      RETURN
      END


      SUBROUTINE M2(X,MAG)
      MAG = 0.0
      DO 1 I = 1,N
      MAG = MAG+X(I)**2
    1 MAG = SQRT(MAG)
      RETURN
      END

The mutants we will consider arise from the single application
of a mutant operator, a simple syntactic or semantic
program transformation such as changing a particular
instance of a relational operator to one of the remaining
operators or changing the target of an unconditional transfer
to another labelled target. We will also refer to
mutant operators as error operators. The obvious objection
here is that such a restriction allows one to do little more
than test for typographical errors in programs, perhaps
useful, but hardly worth such a fuss. As we will discuss
extensively below (Section 4.3) there is an observable
"coupling" of simple and complex errors so that test data
that causes all nonequivalent simple mutants to die is so
sensitive that "likely" complex mutants also die. The
coupling of simple and complex errors implies that if P is
correct for an adequate test D while M1 and M2 die, then P1
must also die on D.
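
As an illustration only, the magnitude subroutines can be transliterated into Python to see which test vectors kill which variants. The function names, the helper killed, and the test vectors are assumptions introduced for this sketch. The variant p1 is taken to combine the two errors of M1 and M2 (wrong initial value and a loop label moved onto the SQRT statement), which is how the coupling remark above reads.

```python
import math

def p(x):
    """Correct program P: Euclidean magnitude of the vector x."""
    mag = 0.0
    for xi in x:
        mag = mag + xi ** 2
    return math.sqrt(mag)

def m1(x):
    """Simple mutant M1: the initial value is 1 instead of 0.0."""
    mag = 1.0
    for xi in x:
        mag = mag + xi ** 2
    return math.sqrt(mag)

def m2(x):
    """Simple mutant M2: the loop label sits on the SQRT statement,
    so the square root is taken inside the loop on every iteration."""
    mag = 0.0
    for xi in x:
        mag = mag + xi ** 2
        mag = math.sqrt(mag)
    return mag

def p1(x):
    """Complex variant P1 (assumed here to combine both errors)."""
    mag = 1.0
    for xi in x:
        mag = mag + xi ** 2
        mag = math.sqrt(mag)
    return mag

def killed(variant, tests):
    """A variant dies if it disagrees with P on at least one test vector."""
    return any(not math.isclose(variant(x), p(x)) for x in tests)

weak_tests = [[3.0]]                 # one-element vectors only
strong_tests = [[3.0], [3.0, 4.0]]   # adds a two-element vector
```

With these definitions, weak_tests kills m1 but leaves m2 alive, because a single pass through the loop hides the misplaced square root. Adding the two-element vector kills m1 and m2, and p1 dies on the same data.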

Observe  that  mutation  analysis  is  a  valid  principle 
(i.e.,  implements  correctness  testing)  if  the  competent 
programmer  assumption  is  valid  and  if  the  coupling  of  simple 
and complex errors is a provable effect. In practice
(theoretical studies notwithstanding [BL1,BL2]) it is not
necessary  to  show  formally  that  these  assumptions  hold  in 
order  for  mutation  analysis  to  be  a  useful  tool  for  testing 
real  programs.  It  is  sufficient  to  know  within  acceptable 
confidence  limits  when  the  assumptions  hold  and  to  work 
within  those  limits. 

We have found that in performing mutation analysis on
an incorrect program, the tester is forced to develop test
data on which his program fails [BDLS]. So we are
interested in building interactive systems to aid programmers
and testers in performing mutation analysis -- and in
so doing, evaluating the effectiveness of this approach. We
pick a programming language L (Fortran, Cobol, and Lisp have
been our initial choices) and -- based on prior research and
other experience -- we define an appropriate set of mutant
operators for L. Then we build an interactive mutation
system that serves as a test harness and aids in performing
mutation analysis. Using three such systems we have for the
past two years been involved in the testing of programs using
mutation analysis and in experiments to discover when
and why the competent programmer assumption holds and how
simple errors can be coupled to complex errors.

Although the various systems we have constructed differ
in certain respects, there are essential similarities. The
basic design was discussed in an earlier paper [BDLS].
Briefly, the systems allow an interactive user to enter a
program to be tested. The program is parsed to a convenient
internal form and appropriate data files are created. The
user then enters test data, executing the program on the
test data in typical harness fashion to check for errors.
At the point of mutation analysis, the user "turns on" a
subset of the error operators and the system then creates a
list of mutant description records, descriptions of how the
internal form is to be modified to create the required
mutant. The changes are induced sequentially and the
modified internal form is interpreted, the results being
compared to the original results to determine whether or not
the mutant survives the execution on that data. At the completion
of the pass, summary reports are presented to the
user, and he is allowed several options in examining the
remaining live mutants to attempt to strengthen his test
data. The user may also declare mutants to be equivalent
and therefore remove them from future consideration. In one
of our systems this function has been partially automated
with considerable improvement in performance. The issue of
equivalent mutants will be discussed more fully in a later
section.
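
The pass described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design of FMS.1 or CMS.1: programs and mutants are modelled as plain Python callables rather than an interpreted internal form, and the names run and mutation_pass are hypothetical.

```python
from typing import Any, Callable, Dict, FrozenSet, List, Sequence, Tuple

Program = Callable[[Any], Any]

def run(program: Program, case: Any) -> Tuple[str, Any]:
    """Execute a program on one test case.

    A run-time error is recorded as a result of its own, so a mutant that
    crashes where the original does not is counted as killed.
    """
    try:
        return ("ok", program(case))
    except Exception as exc:
        return ("error", type(exc).__name__)

def mutation_pass(
    original: Program,
    mutants: Dict[str, Program],
    tests: Sequence[Any],
    equivalent: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return the names of the mutants still alive after one pass.

    Mutants the user has declared equivalent are skipped. Every other
    mutant is run on all test cases and survives only if each result
    matches the original program's result.
    """
    expected = [run(original, case) for case in tests]
    live = []
    for name, mutant in mutants.items():
        if name in equivalent:
            continue
        if [run(mutant, case) for case in tests] == expected:
            live.append(name)
    return live
```

A tester would inspect the returned list, add test cases that separate a live mutant from the original, or declare the mutant equivalent, and then run the pass again.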

Part of our early experience with mutation systems was
the testing, using the first Fortran system FMS.1, of the
statement scanner of FMS.1 itself. In elapsed time, the
nearly o , o 0 0 mutants were completely analyzed in six man-hours,
using approximately 14 cpu minutes of a slow PDP-10
KA-10 processor running the TOPS-10 timesharing operating
system. A more complete description of this analysis is
available in [BDLS]. We will return to the question of the
efficiency of mutation analysis in Section 4.

3.  THE  MUTATION  SYSTEMS 

3.1 Fortran. In the fall of 1977, a pilot mutation
system for a subset of Fortran became operational on a PDP-10
computer at Yale University. This is the PIMS system
discussed in detail in [BDLS]; in anticipation of several
versions of mutation systems for several different languages
we have since adopted the following naming conventions for
our systems. A system is denoted by a string

     <lang>MS.<version>,

where <lang> is a unique identification for the language
(e.g., F for Fortran) and <version> is a chronological version
number. Thus, PIMS is the system FMS.1. Subsequently,
FMS.1 was implemented on a DEC KL-20 at Yale, and a PRIME
400 at Georgia Tech. Although the Fortran subset required
is restrictive, it has been large enough to permit a
body of experience with mutation analysis to accumulate
(see the experiment in [BDLS] and Section 6, for
example). The restricted language accepted by FMS.1 eventually
became a bottleneck for the experimenters. Therefore, during
the year 1978-1979, an expanded Fortran system, FMS.2
(the system sometimes referred to as EXPER) was constructed.
FMS.2 accepts any ANSI Fortran program which does not use
complex arithmetic or input/output statements (for programs
which do not meet this restriction, recoding must replace
input/output statements by array assignments). FMS.2 is
fully operational on the DEC KL-20 at Yale and is being implemented
on the VAX-11 at Berkeley. While the overall
goals of the Fortran systems are similar, FMS.2 differs from
FMS.1 in several important respects. FMS.1 was designed
with user-oriented features in mind; it was anticipated that
testers unfamiliar and unsympathetic with the system would
be the primary user community. FMS.2, on the other hand,
was designed primarily as an experimental device for the
mutation research groups, to facilitate experiments into how
mutation analysis can be integrated into the design, coding
and testing of multi-module programs, experiments into the
sufficiency of various sets of mutant operators and for
various experiments surrounding the coupling effect and the
overall effectiveness of the mutation approach.


FMS.2 sessions are organized around the concept of an
experiment. An experiment consists of a program, test data,
and a subset of the error operators which may be applied to
the program. The experimenter is more easily able to
generate small variations in each of these elements and
monitor the progress of subjects using FMS.2 to perform the
mutation analysis. As with FMS.1, this system responds with
summaries and reports on the number and type of mutants
which remain alive, so that the user can augment his tests.

The basic set of error operators supplied by FMS.2 is:

Data Reference Mutations

 1. Constant Replacement (by +1, -1)
 2. Scalar for Constant Replacement
 3. Source Constant Replacement
 4. Array Reference for Constant Replacement
 5. Scalar Variable Replacement
 6. Constant for Scalar Replacement
 7. Array Reference for Scalar Replacement
 8. Comparable Array Name Replacement
 9. Constant for Array Reference Replacement
10. Scalar for Array Reference Replacement
11. Array Reference for Array Reference Replacement

Operator Mutations

12. Arithmetic Operator Replacement
13. Relational Operator Replacement
14. Logical Connective Replacement
15. Unary Operator Replacement
16. Unary Operator Removal
17. Unary Operator Insertion

Statement Mutations

18. Statement Analysis (C-1 Path analysis)
19. Statement Deletion
20. Return Statement Replacement

Control Structure Mutations

21. Jump Statement Replacement
22. DO Statement Replacement
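
To make one of these operators concrete, the following sketch applies relational operator replacement (operator 13) to Python source using the standard ast module. It is an analogy only: FMS.2 works on Fortran, and the function name relational_mutants is an assumption of this example. Each mutant differs from the original in exactly one comparison operator.

```python
import ast
import copy

# The six relational operators that may be interchanged.
RELATIONAL = [ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq]

def relational_mutants(source):
    """Return the source text of every single-operator relational mutant."""
    tree = ast.parse(source)
    # Record each mutation site as (position of the Compare node in
    # ast.walk order, index of the operator inside that node).
    sites = [
        (node_index, op_index)
        for node_index, node in enumerate(ast.walk(tree))
        if isinstance(node, ast.Compare)
        for op_index, op in enumerate(node.ops)
        if type(op) in RELATIONAL
    ]
    mutants = []
    for node_index, op_index in sites:
        for replacement in RELATIONAL:
            clone = copy.deepcopy(tree)
            node = list(ast.walk(clone))[node_index]
            if isinstance(node.ops[op_index], replacement):
                continue  # skip the operator already present
            node.ops[op_index] = replacement()
            mutants.append(ast.unparse(clone))
    return mutants

example = "def smaller(a, b):\n    return a if a < b else b\n"
```

For the single comparison in example this yields five mutants, one for each of the remaining relational operators. ast.unparse requires Python 3.9 or later.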

3.2 Cobol. The design of the Cobol mutation system
CMS.1 is based on the original design of FMS.1. The reader
will get an idea of the way in which CMS.1 interacts with
users by consulting the corresponding descriptions for FMS.1
in [BDLS]. CMS.1 accepts a simple subset of the Cobol
language and supports up to ten rewindable input files and
ten non-rewindable output files. This has been found to be
adequate for a variety of data processing tasks and should
allow the analysis of a large selection of Cobol programs.
CMS.1 is currently implemented on a PRIME 400 computer at
Georgia Tech.

Mutants are said to exhibit equivalent behavior if they
produce the same output records as the original program.
Mutants may fail by producing different output, or by a run-time
error such as referencing undefined data, referencing
nonnumeric data in a numeric instruction, trying to use a
file unit that is not open, etc.

As might be expected, the introduction of input/output
and data structuring capabilities creates special problems
for CMS.1 not encountered in the Fortran systems. The following
are the error operators which appear to be unique to
the Cobol language.

1. Move implied decimal point in
   numeric items one place to the left
   or to the right.

2. Add or subtract one from an OCCURS
   clause count.

3. Insert FILLER of length one between
   two adjacent record items; also
   change FILLER lengths by one.

4. Reverse adjacent elementary items in
   records.

5. Alter file references.

6. Switch PERFORMs and GOTOs.

7. Change ROUNDED to truncation and
   vice-versa.

8. Change the sense of a MOVE.


The  remaining  error  operators  include  the  operator 
replacements  and  control  flow  mutations  that  are  described 
above. As primitive as this subset of Cobol appears, it is
adequate for broad-based experimentation, including the
analysis  of  many  production  Cobol  programs  supplied  to  the 
mutation  research  group  by  external  sources. 

CMS.1  is  unique  in  another  respect.  While  some  module 
testing  of  FMS.1  and  FMS.2  was  carried  out  by  the  design 
teams,  access  to  reasonable  subsets  of  the  implementation 
languages  was  limited  by  the  concerns  detailed  above. 
CMS.1, on the other hand, is being tested extensively using
the  FMS.2  system  at  Yale. 

The Appendix contains essentially a script of a CMS.1
session on a production Cobol program drawn from the US Army
personnel system SIDPERS. The program has been modified
somewhat, mainly in the reduction of the record sizes to
make a better CRT display. The program takes as input two
files, representing an old backup tape and a new one. The
output is a summary of the changes. The input files are assumed
to be sorted on a key field. The program is 130 lines
long and has 1195 mutants, of which 37 are easily seen to be
equivalent to the original program. Initially ten test
cases were generated to eliminate all of the nonequivalent
mutants. Subsequently a subset of five test cases was found
to be adequate for the task. The entire run took about 7
minutes of clock time, and 2 minutes and 45 seconds of CPU
time on the PRIME 400.

4. THE COMPLEXITY OF MUTATION ANALYSIS


At first blush, it would seem that there is a severely
limitative trade-off at work in the technique described in
the previous section. In order to be efficient, the number
of distinct mutants must be kept rather small. But the list
of potential errors (rather, the list of error operators) in
order to be realistic must be quite extensive. Apparently,
then, if we try to constrain the number of mutants of an N
statement program to some reasonable size -- say, p(N), for
a "small" polynomial p* -- mutation analysis loses its effect
as a realistic model of the programming process. If on
the other hand we try to build into the analysis all of the
possible error types which we can expect to encounter, then
the number of mutants associated with an N statement program
need not be bounded by any reasonable function of N.

In this section we will show how the choice of the first
alternative in the tradeoff is justified. In fact, an N
statement program -- on the average -- will generate only
polynomially many mutants, most of which are unstable and
die in the analysis stage very quickly. A "coupling effect"
is invoked to save the method from only being capable of
dealing with trivial errors, and we will report on some
preliminary experimental evidence for our belief in the
coupling effect.


*This seems reasonable. Polynomial growth in complexity in
the analysis of algorithms is generally identified with computational
tractability. In testing for correctness or in
program verification, even subcases which are solvable tend
to be of nonpolynomial complexity (usually exponential or
worse).

4.1 The Number of Mutants. Youngs' data and several
less widely reported but related studies [TRW,Gil] suggest
very strongly that the errors that tend to occur in programs
are relatively simple errors. To be precise, let us define
a simple mutant as follows. Let P be a program written in a
programming language defined by a grammar G, and let parse(P)
be the syntax tree for P obtained by parsing P according
to G. Then a 1-order simple mutant operator ER is a
function mapping a parse tree T to a tree ER(T) so that T
and ER(T) differ by at most one terminal node (i.e., leaf).
ER(T) is said to be a simple 1-order mutant of T. Proceeding
inductively, a k-order mutant is simply a k-fold iteration
of 1-order mutants. In particular, notice that simple
mutants do not alter the "semantic structure" of a program
-- that is, they do not modify the internal nodes of the parse
tree. The error operators designed for the automated
systems are with few exceptions simple 1-order mutants.

We  will  first  give  a  heuristic  analysis  of  the  expected 
number  of  mutants  of  a  program  as  a  function  of  several  size 
parameters.  The  list  of  mutant  operators  for  FMS.1  and 
FMS.2  is  relatively  unsophisticated  and  has  undergone  little 
revision  that  would  improve  the  number  of  generated  mutants 
(CMS.1  by  contrast  has  a  rather  more  streamlined  mutant 
generation  system),  so  our  analysis  is  not  biased  in  favor 
of  simple  mutants. 

First, it is possible to derive an order-of-growth expression
for the number of FMS.1 mutants. Data reference
replacements are accomplished by interchanging reference
names occurring within the program. In a program with N
statements and K distinct data references this number is

     F(N,K) = O(K^2).

The reader can convince himself (cf. [Kn]) that for each of
the constant and operator replacement schemes there is a
constant c so that the number of generated mutants is bounded
by cK. Therefore, F(N,K) is the dominant term, and the
number of generated mutants is in the worst case quadratic
in the number of distinct data references.

Observations of typical programs lead to an even more
favorable estimation of the expected number of mutants
generated under FMS.2. In programs that are not maliciously
dense (for an example of such a dense program see [LS])
F(N,K) is more closely approximated by

     F(N,K) = O(NK)

while in typical programs, such as those discovered by Knuth
[Kn] the data references tend to be so sparsely distributed
that the rate of growth is usually closer to quadratic in N:

     F(N,K) = O(N^2).
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In  generating  mutants  of  Cobol  programs,  it  is  possible 
to  more  nearly  approach  linear  growth,  since  the  number  of 
data  reference  interchanges  is  limited  by  syntactical  redun¬ 
dancies.  In  fact,  an  analysis  similar  to  the  one  carried 
out  above  gives  the  worst  case  estimate  for  the  expected 
number  of  mutants  for  a  Cobol  program  as  the  number  of  data 
division  lines  multiplied  by  the  number  of  procedure 
division  lines.  For  typical  Cobol  programs  this  estimate  is 

2 

C  (  N  .  K )  <<  N. 

Figures 1 and 2 show mutant growth rates for a sampling of
Fortran and Cobol programs. Notice that in both cases (except
for the variation in small Fortran programs) the estimates
given above are generous upper bounds on the observed number
of mutants. In experiments using CMS.1, we have found the
average growth rate for "production" Cobol programs to be
more nearly linear in the product of procedure division
lines and K than quadratic in N.

4.2 Mutant Instability.

4.1 The Coupling Effect. Using only the mutant operators
defined above, it would seem likely that a program that had
been successfully subjected to mutation analysis might still
contain some complex errors, errors which are not explicit
mutants of the program and are not distinguished by the test
data. In [DLS1], we proposed a "coupling effect" which
asserted the existence of significant classes of programs for
which such omissions are rare; briefly stated, the coupling
effect asserts:


The  Coupling  Effect 

TEST DATA ON WHICH ALL SIMPLE MUTANTS
FAIL IS SO SENSITIVE TO CHANGES IN THE
PROGRAM THAT IT IS LIKELY THAT ALL
COMPLEX MUTANTS MUST ALSO FAIL.

Note that there is no claim that the coupling effect is
a provable phenomenon in a mathematical sense; indeed, there
are very simple counterexamples to it. It is, however, a
useful principle that can be observed to hold for broad
classes of programs. We come therefore to consider on what
evidence we believe in the coupling effect.

First, we know that there is a provable coupling effect
for certain restricted models of computation. In [BL1], the
following was proved: Let P be a complete decision table
program (i.e., one missing no actions or conditions), and
let P evaluate correctly on test data that is adequate under
the following mutant operators:

    Replace any condition by "don't care"
    Complement any condition
    Replace any "don't care" by "yes" and "no"
    Delete any action
    Add any action;

then P is correct.

It has also been conjectured that a provable coupling
effect can be exhibited for several other formally
interesting classes of programs, such as pure Lisp functions
and linear recursive schemes (cf. [BL2]).

Second, there is a great variety of observational
evidence for the coupling effect. Investigators using the
DAVE test data generation system at the University of
Colorado, for example, have reported that even using a
restricted set of error operators the ability to detect
simple errors is oftentimes useful in insuring against more
complex errors [OF1, OF2].

Third,  there  is  a  growing  experimental  understanding  of 
the  coupling  effect  in  functioning  programs.  We  give  here 
an  example  of  the  empirical  evidence.  The  subject  program 
is  Hoare's  FIND  program  [Hoa].  As  described  in  [DLS1],  FIND 
was  used  in  the  following  experiment. 

1.  A test data set of 49 cases was derived and
    shown to be adequate.

2.  The test data set from 1 was heuristically
    reduced to a set of 7 test cases which also
    turned out to be adequate.

3.  Random simple k-order mutants were selected
    (k > 1).

4.  The higher order mutants of step 3 were
    executed on the reduced test data set.

It would be evidence against the coupling effect if it was
possible to randomly generate very many higher order
non-equivalent mutants on which the reduced test data set
behaved in a manner indistinguishable from FIND. Notice
that Step 2 biases the experiment against the coupling
effect since it removes the man-machine orientation of
mutation analysis. We concentrated first on the case k = 2,
reasoning that the larger the value of k, the more one
violates the competent programmer assumption, with the
following results:

    Number of 2-order mutants               21,100
    Number indistinguishable from FIND          19
    Number equivalent to FIND                   19.

However, a more limited analysis of still higher order
mutants still failed to reject the coupling effect:

    Number of k-order mutants (k>2)          1,500
    Number indistinguishable from FIND           0.
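
The shape of this experiment can be sketched on a toy scale. The Python fragment below is not the FIND experiment: the subject routine (largest of three values), the operator set, and the test data are all assumptions chosen so that the run is small enough to check by hand. A k-order mutant here changes exactly k of the two comparison operators.

```python
import itertools
import operator

OPS = [operator.lt, operator.le, operator.gt, operator.ge]
ORIGINAL = (operator.lt, operator.lt)


def make_program(ops):
    """Return a 'largest of three' routine using the given comparisons."""
    def largest(a, b, c):
        m = a
        if ops[0](m, b):
            m = b
        if ops[1](m, c):
            m = c
        return m
    return largest


def mutants(order):
    """All mutants differing from ORIGINAL at exactly `order` sites."""
    for ops in itertools.product(OPS, repeat=len(ORIGINAL)):
        if sum(o is not p for o, p in zip(ops, ORIGINAL)) == order:
            yield ops


def survivors(order, tests):
    """Mutants of the given order that agree with the original on all tests."""
    ref = make_program(ORIGINAL)
    return [ops for ops in mutants(order)
            if all(make_program(ops)(*t) == ref(*t) for t in tests)]


TESTS = list(itertools.permutations((1, 2, 3)))
print(len(survivors(1, TESTS)), len(survivors(2, TESTS)))  # 2 1
```

In this toy case every survivor replaces `<` by `<=`, which cannot change the value returned, so the live mutants of both orders are equivalent to the original routine.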

A  major  defect  in  this  experiment  can  be  brought  to 
light  by  considering  the  following  conceptual  basis  for 
error  coupling.  Just  as  the  competent  programmer  assumption 
states  that  programs  are  not  written  at  random,  the  coupling 
effect  is  implied  by  the  fact  that  program  statements  are 
not  composed  at  random;  indeed,  there  is  considerable  flow 
and  sharing  of  information  between  statements  of  a  program, 
so  that  a  change  to  one  portion  of  a  program  is  likely  to 
have  observable,  albeit  subtle,  effects  on  its  global 
context.  Now  for  the  problem  with  this  experiment:  the 
k-order  mutants  are  chosen  randomly  and  by  independent 
drawings  of  1-order  mutants.  Therefore  the  resulting 
higher-order  mutant  is  very  unstable  and  subject  to  quick 
failure.  The  experiment  should  also  be  conducted  when  the 
higher-order  mutants  contain  subtly  related  errors.  To  this 
end,  the  experiment  was  repeated  using  the  following 
replacement  for  step  3: 

3':  Randomly  generate  correlated  k-order 
mutants  of  the  program. 

In  Step  3',  correlated  means  that  each  of  the  k  applications 
of  1-order  mutant  operators  will  be  related  in  some  way  to 
all  of  the  preceding  applications,  all  affecting  the  same 
line,  for  example.  As  before,  if  a  program  is  successfully 
subjected  to  mutation  analysis  on  a  test  data  set,  then  the 
coupling  effect  asserts  that  the  correlated  k-order  mutants 
are  also  likely  to  fail  on  the  test  data. 

In addition to FIND, we use the program STKSIM which
maintains a stack and performs the operations clear, push,
pop, and top.

Figure 3 contains a summary of the results of the
experiment. Although much careful experimentation under more
stringent statistical analyses must be carried out, there is
probably enough information to conclude that there is a
meaningful sense in which errors are coupled by an
appropriate choice of error operators.

                 k = 2                k = 3                k = 4
PROGRAM    NUMBER     NUMBER    NUMBER     NUMBER    NUMBER     NUMBER
NAME       GENERATED  ALIVE     GENERATED  ALIVE     GENERATED  ALIVE

FIND       3000       2         3000       0         3000       0
STKSIM     3000       3         3000       0         3000       0

           Figure 3. Correlated k-order Mutants

The results are for the most part self-explanatory. All of
the live correlated k-order mutants described in the table
have been shown equivalent by rather simple arguments.

Although we have attempted no thorough statistical
analyses of these experiments, the size of the samples
(nearly 50,000 combined correlated and uncorrelated mutants)
is certainly large enough to sustain statistically
significant conclusions assuming a variety of underlying
models and distributions.

Less formal but nevertheless striking evidence is of the
"testimonial" variety. Since 1976 we have conducted mutation
analysis sessions on perhaps several hundreds of
Fortran, Cobol, and Lisp programs. So many instances of the
coupling of simple and complex errors have been observed
over such a wide range of programs that it is likely there
is an observable effect at work.

4.4 Reducing Complexity. Even with all of the foregoing
reduction techniques, current technology places the
bounds of practicality for monolithic programs somewhere in
the 5,000 to 10,000 line range for Fortran and somewhat
higher for Cobol programs. Even this must be treated as an
optimistic upper limit -- certainly the technique is not
easy to apply at the 5,000 statement level. A speculative
but not unjustifiable technique is to use Monte Carlo
techniques to sample from large populations of mutants. A
simple argument to support such an analysis can be had via
the following Gedanken experiment. Let

    f(x)

appear in a specific context of a program undergoing
mutation analysis; if a set of test data is too weak for the
program but the program is nevertheless correct, then there
is an adequate set of test data, D, on which

    [f(x)]*(D) ≠ [f(x')]*(D),

where x' is some specified data reference replacement
mutation of x. But x and x' in these expressions are BOUND
variables; it only matters that they refer to distinct
positions of a state vector which has been specially
constructed to exhibit the inequality. In other words it is
important that we are able to "explain" with test data why x
is an argument of f, but perhaps less important that we be
able to explain why the argument is not x' or any other
specific alternative. But this can be accomplished by
sampling from enough alternative choices x' to insure that
identities that we are observing are not mathematical. If
the functions involved are at all well-behaved algebraically
then algebraic identities can be discerned in this way (see
[DL] for simple cases). In one experiment, mutation
analysis on only 10 percent of the total mutant population
resulted in test data strong enough to kill 95 percent of
the entire mutant population.
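
A minimal sketch of the sampling idea follows. It does not reproduce the experiment just cited; the subject function, the form of the mutants (one occurrence of x replaced by a constant c), the 10 percent fraction, and the candidate inputs are all assumptions made for illustration. Test data is developed against the sample only and then scored against the whole population.

```python
import random


def original(x):
    return x * x


def make_mutant(c):
    # Hypothetical data reference replacement: one x becomes the constant c.
    return lambda x: x * c


def sample_mutants(population, fraction, seed=0):
    """Draw a random sample containing roughly `fraction` of the population."""
    rng = random.Random(seed)
    size = max(1, round(fraction * len(population)))
    return rng.sample(population, size)


def derive_tests(sample, candidates):
    """For each sampled mutant keep the first candidate input that kills it."""
    tests = set()
    for c in sample:
        mutant = make_mutant(c)
        for x in candidates:
            if mutant(x) != original(x):
                tests.add(x)
                break
    return sorted(tests)


def kill_ratio(population, tests):
    """Fraction of the whole population killed by the given tests."""
    killed = sum(
        1 for c in population
        if any(make_mutant(c)(x) != original(x) for x in tests)
    )
    return killed / len(population)


population = list(range(1, 101))
sample = sample_mutants(population, 0.10)
tests = derive_tests(sample, candidates=range(5))
print(len(sample), tests, kill_ratio(population, tests))
```

In this toy population nearly every mutant dies on the same few inputs, so a small sample already yields test data that kills at least 99 of the 100 mutants; real populations need not be so uniform.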

If reliable patterns can be found by such sampling
techniques then the range of programs which can be analyzed
is expanded by an order of magnitude. We anticipate
reporting on this research elsewhere.

There is an obvious method which will further reduce
the amount of time needed to process mutants. Since
mutants, once generated, are entirely independent entities,
copies of mutant description records may be distributed
among several computers for parallel execution. It is
feasible to decrease running times by amounts dependent only
on the amount of computer resources one is willing to invest
in the analysis.
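
Because each mutant is processed independently, the distribution step is simple to sketch. In the fragment below a pool of worker threads stands in for the several computers, and the record format (an identifier paired with a replacement constant) is hypothetical; neither is taken from the systems described here.

```python
from concurrent.futures import ThreadPoolExecutor

TESTS = [0, 1, 2, 3]


def original(x):
    return x * x


def run_mutant(record):
    """Run one mutant description record; report whether it was killed."""
    mutant_id, constant = record
    killed = any(x * constant != original(x) for x in TESTS)
    return mutant_id, killed


# Hypothetical records: mutant i replaces one x by the constant i + 1.
records = [(i, c) for i, c in enumerate(range(1, 9))]

# Each record is handled on its own; no mutant depends on another.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(run_mutant, records))

sequential = [run_mutant(r) for r in records]
print(parallel == sequential)  # True
```

The parallel and sequential runs report the same outcome for every record, which is the property that makes the distribution safe.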


5.  ERROR  OPERATORS  FOR  CLASSES  OF  ERRORS 

Of course the whole point of program testing, and
therefore of mutation analysis, is to detect errors in
programs that are not correct. So far we have given no
evidence that mutation analysis is a useful tool in this
regard. In this and in the following section, we will
indicate our current state of knowledge. First, we will
describe a wide class of error types and show by example how
the error operators which are currently implemented are
useful in detecting errors of those types. Second -- in the
following section -- we will describe a case study of the
uncovering of a resistant, complex error in a production
system using mutation analysis.

5.1 Simple Errors. If the program contains a simple
error, then one of the mutants generated by the system will
be correct. The error will be discovered when an attempt is
made to eliminate the correct program since its behavior
will be correct but the program being tested will give
differing results. If the program contains simple k-order
errors that are relatively independent and each error is
exposed by a single mutant, then the errors will also be
detected (see Section 6 for an example).

5.2 Dead Statements. As described by Huang [Hua], many
programming errors manifest themselves in "dead code", that
is, source statements that are unexecutable or, more
seriously, give incorrect results regardless of the data
presented. Such errors may persist for weeks or even years
if the errors lie in rarely executed portions of the
program.

It  is  therefore  a  reasonable  first  goal  in  testing  a 
program  to  insist  that  each  statement  be  executed  at  least 
once.  Typical  methods  for  achieving  this  goal  include  for 
example  the  insertion  of  instruction  counters  into  straight 
line  segments  of  the  program,  so  that  a  non-zero  vector  of 
counters  indicates  that  the  instrumented  statements  have  all 
been  executed  at  least  once. 

During mutation analysis, the goal outlined above will
be viewed from a slightly different perspective. If a
statement cannot be executed, then clearly we can change the
statement in any way we want, and the effects of the changes
will not be noticeable as the program runs -- in particular
the altered program will not be distinguishable in its
output behavior from the original one. There is, however, a
mutant operator which draws the tester's attention to this
situation in a more economical way. Among the mutants are
those which replace in turn the first statement of every
basic block by a call to a routine which aborts the run when
it is executed. Such mutations are extremely unstable since
any data which causes the execution of the replaced
statement will also cause the mutant to produce incorrect
results and hence to be eliminated. The converse is also
true. That is, if any of these mutants survives the
analysis then the altered statement has never been executed.
Therefore, accounting for the survival of these mutants
gives important information about which sections of the
program have been executed.
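
The abort mutants can be imitated in a few lines. In this sketch the subject program, its three basic blocks, and their names are hypothetical; each mutant aborts on entry to one block, and the blocks whose mutants survive are exactly those the test data never reached.

```python
class Aborted(Exception):
    """Raised by the routine that replaces the first statement of a block."""


def run(x, abort_block=None):
    """Toy program with three basic blocks (the block names are hypothetical)."""
    def enter(block):
        if block == abort_block:
            raise Aborted(block)

    enter("entry")
    y = x
    if x < 0:
        enter("negate")
        y = -x
    else:
        enter("keep")
    return y


def surviving_abort_mutants(tests, blocks=("entry", "negate", "keep")):
    """Return the blocks whose abort mutant no test case eliminates."""
    alive = []
    for block in blocks:
        killed = False
        for x in tests:
            try:
                run(x, abort_block=block)
            except Aborted:
                killed = True
                break
        if not killed:
            alive.append(block)
    return alive


print(surviving_abort_mutants([1, 2]))   # ['negate']
print(surviving_abort_mutants([-1, 1]))  # []
```

With only positive inputs the mutant for the `negate` block survives, pointing directly at the statement that was never executed.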

This analysis shows why apparently useful testing
heuristics can lead one astray. For example, it has been
suggested [Ham] that not executing a statement is equivalent
to deleting it, but this discussion shows how such a strategy
can fail. A statement can be executed and still serve no
useful purpose. Suppose that we replace every statement by
a convenient NO-OP such as the Fortran CONTINUE. The
survival or elimination of such mutants gives more
information than merely whether or not the statement has been
executed. It indicates whether or not the statement has any
observable effect upon the output. If a statement can be
replaced by a NO-OP with no observable effect, then it can
indicate at best that machine time is wasted in its execution
(possibly a design error) and at worst a much more serious
error.

Insuring that every statement is executable is no
guarantee of correctness [GG, How1]. Predicate errors or
coincidental correctness may pass undetected even if every
statement is successfully executed. We will return to these
error types later in this section.

5.3 Dead Branches. It has been noted (see [Hua]) that
an improvement over simply analyzing the execution of
statements can be had by analyzing the execution of
branches, attempting to execute every branch at least once.
For example, the program segment has the flowchart shown in
Figure 4.

[Figure 4. Flowchart of the program segment.]

All statements A, B and C can be executed by a single
test case. It is not true, however, that in this case all
branches have been executed. In this example the empty else
clause branch can be bypassed even though A, B and C are
executed.

However,  the  requirement  that  every  branch  be  traversed 
can  be  restated:  every  predicate  must  evaluate  to  both  TRUE 
and  FALSE.  The  latter  formulation  is  used  in  mutation 
analysis.  There  are  error  operators  to  replace  each  logical 
expression  by  boolean  constants.  Like  the  statement 
analysis  mutations  described  above,  these  mutations  tend  to 
be  unstable  and  are  easily  eliminated  by  almost  any  data. 
If  these  mutants  survive,  they  point  directly  to  a  weakness 
in  the  test  data  which  might  shield  a  possible  error. 

Mutating each relation or each logical expression
independently actually achieves a stronger test than that
achieved by the usual techniques of branch analysis. For
consider the compound predicate

    IF(A.LE.B.AND.C.LE.D)THEN ...

Simple branch coverage requires only two test cases to
test the predicate. But suppose that the test points for
the covering test are

    A < B and C < D

and

    A < B and C > D.

These points have the effect of only testing the second
clause. This kind of analysis fails to take into account
the hidden paths [DLS1] implicit in compound predicates (see
Figure 5). In testing all the hidden paths, mutation
analysis requires at least three points to test the
predicate, corresponding to the branches (A>B, C>D),
(A<=B, C>D), and (A<=B, C<=D).
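
The weakness of the two-point covering test can be checked directly. The sketch below transcribes the predicate into Python and replaces each relation, and the whole expression, by the boolean constants; the concrete test values are assumptions. One further assumption is flagged in the code: for the mutant that replaces the first relation by TRUE to die, the added third point is chosen with A > B while C <= D holds, so that the first clause alone decides the outcome.

```python
def original(a, b, c, d):
    return a <= b and c <= d


# Each relation, and the whole expression, replaced by a boolean constant.
MUTANTS = {
    "A.LE.B -> TRUE": lambda a, b, c, d: True and c <= d,
    "A.LE.B -> FALSE": lambda a, b, c, d: False and c <= d,
    "C.LE.D -> TRUE": lambda a, b, c, d: a <= b and True,
    "C.LE.D -> FALSE": lambda a, b, c, d: a <= b and False,
    "whole -> TRUE": lambda a, b, c, d: True,
    "whole -> FALSE": lambda a, b, c, d: False,
}


def survivors(tests):
    """Names of the mutants that agree with the original on every test."""
    return [name for name, mutant in MUTANTS.items()
            if all(mutant(*t) == original(*t) for t in tests)]


# The two covering points from the text: (A<B, C<D) and (A<B, C>D).
branch_cover = [(1, 2, 1, 2), (1, 2, 2, 1)]
# Assumed third point: A > B with C <= D, so only the first clause decides.
three_points = branch_cover + [(2, 1, 1, 2)]

print(survivors(branch_cover))  # ['A.LE.B -> TRUE']
print(survivors(three_points))  # []
```

Both branches of the IF are taken by the two covering points, yet the first relation could be replaced by TRUE without any visible change; only the third point removes that mutant.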
As a more concrete example, consider the program shown
in Figure 6. This program is adapted from [Gel] and was
studied in [OW]; it is intended to calculate the number of
days between two given dates. The predicate which
determines whether a year is a leap year is incorrect.
Notice that if the year is divisible by 400 (i.e., if
year REM 400 = 0) it is necessarily divisible by 100 (i.e.,
year REM 100 = 0). Therefore the logical expression formed
by the conjunction of these clauses is equivalent to the
second clause alone. Alternatively the expression year REM
100 = 0 can be replaced by the logical constant TRUE and the
resulting mutant is equivalent to the original program.
Since it is not obvious what the programmer had in mind, the
error is discovered. Notice also that mutation analysis
shows that the assignment daysin(12) := 31 is redundant and
can be removed from the program.
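
The equivalence claimed for this mutant can be confirmed by brute force. The Python transcription below takes the predicate as it is printed in Figure 6; the comparison against the usual Gregorian rule is our addition and serves only to show that the predicate as written is indeed wrong.

```python
def is_leap_as_written(year):
    # Transcription of the predicate as printed in Figure 6.
    return (year % 400 == 0) or (year % 100 == 0 and year % 400 == 0)


def is_leap_mutant(year):
    # Mutant: the clause "year REM 100 = 0" replaced by the constant TRUE.
    return (year % 400 == 0) or (True and year % 400 == 0)


def is_leap_gregorian(year):
    # The usual Gregorian rule, included only for comparison.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


years = range(1, 4001)
print(all(is_leap_as_written(y) == is_leap_mutant(y) for y in years))  # True
print(is_leap_as_written(1980), is_leap_gregorian(1980))  # False True
```

No test year can separate the mutant from the program, so the tester is forced to ask what the clause was meant to accomplish, and that question exposes the error.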
PROCEDURE calendar(INTEGER VALUE day1, month1, day2, month2, year);
BEGIN
   INTEGER days
   IF month2 = month1 THEN days := day2 - day1
   COMMENT if the dates are in the same month, then
           we can compute the number of days directly;
   ELSE
   BEGIN
      INTEGER ARRAY daysin(1..12)
      daysin(1) := 31; daysin(3) := 31; daysin(4) := 30;
      daysin(5) := 31; daysin(6) := 30; daysin(7) := 31;
      daysin(8) := 31; daysin(9) := 30; daysin(10) := 31;
      daysin(11) := 30; daysin(12) := 31;
      IF ((year REM 400) = 0) OR
         ((year REM 100) = 0 and (year REM 400) = 0)
         THEN daysin(2) := 28 ELSE daysin(2) := 29;
      COMMENT set daysin(2) according to whether or not
              year is leap year;
      days := day2 + (daysin(month1) - day1);
      COMMENT this yields the number of days in complete
              intervening months;
      FOR i := month1 + 1 UNTIL month2 - 1 DO days := daysin(i) + days;
      COMMENT add in the days in complete months;
   END
   WRITE(days)
END;


Figure 6.
5.4 Data Flow Errors. A program may access a variable in
one of three ways. A variable is said to be defined if
the result of a statement is to assign a value to the
variable. A variable is said to be referenced if its value
is required by the execution of a statement. A
variable is said to be undefined if the semantics of the
language does not explicitly give any other value to the
variable. Examples of the latter are the values of local
storage after procedure return or Fortran DO loop indices
after normal loop termination.

Following Fosdick and Osterweil [OF1] we consider three
types of data flow anomalies which are often indicative of
program errors. These anomalies are consecutive accesses to
a variable of the following forms:

1.  undefined then referenced,

2.  defined then undefined,

3.  defined then redefined.

Anomaly 1 is almost always indicative of an error, even
if it occurs only on a single path between the point at
which the variable becomes undefined and its point of
reference. Anomalies 2 and 3 tend to indicate errors when
they are unavoidable, that is, when they occur along a cut
set of the flow graph.

The second and third types of anomalies are attacked
directly by mutation operators. If a variable is defined
and is not used then in most cases the defining statement
can be eliminated without effect (by insertion of a CONTINUE
statement for instance). This may not be the case if in the
course of defining the variable a function with side effects
is invoked. In this case, the definition can very likely be
altered in many ways with no effect on the side effect,
resulting in the variable being given different values. An
attempt to remove these mutations will usually result in
the anomaly being discovered.

It is more difficult to see which operators address
anomalies of the first type; the underlying errors are
attacked by the discipline imposed by mutation analysis.
Recall that a mutation system is a large interpretive system
for automatically generating and testing mutants. Whenever
the value of a variable becomes undefined it is set by the
interpreter to the unique constant UNDEFINED. Before every
variable reference a check is performed by the interpreter
to see if the variable has undefined values. If the
variable is UNDEFINED the error is reported to the user, who
can then take action.
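
The interpreter's check amounts to a guarded lookup. The sketch below shows one way such a store might behave; the class and method names are hypothetical and do not describe the internals of the mutation systems.

```python
class UndefinedReference(Exception):
    """Reported when a variable is referenced while UNDEFINED."""


UNDEFINED = object()  # the unique constant marking an undefined value


class Store:
    """Variable store that checks every reference (a hypothetical design)."""

    def __init__(self, names):
        self._values = {name: UNDEFINED for name in names}

    def define(self, name, value):
        self._values[name] = value

    def undefine(self, name):
        # E.g. a DO loop index after normal loop termination.
        self._values[name] = UNDEFINED

    def reference(self, name):
        value = self._values[name]
        if value is UNDEFINED:
            raise UndefinedReference(name)
        return value


def is_anomalous(store, name):
    """True if referencing `name` now would be an undefined reference."""
    try:
        store.reference(name)
    except UndefinedReference:
        return True
    return False


store = Store(["I"])
print(is_anomalous(store, "I"))  # True: never defined
store.define("I", 10)
print(is_anomalous(store, "I"))  # False
store.undefine("I")
print(is_anomalous(store, "I"))  # True: undefined then referenced
```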


5.5 Domain Errors. The notion of a domain error is due
to Howden [How1]. A domain error occurs when an input value
causes an incorrect path to be executed due to an error in a
control statement. Domain errors are to be contrasted with
computation errors which occur when an input value causes
the correct path to be followed but an incorrect function of
the input value is computed along that path due to an error
in a computation statement. These notions are not precise
and it is difficult with many errors to decide in which
category they belong.

A method of reliably uncovering domain errors is the
domain strategy proposed by White, Cohen, and Chandrasekaran
[WCC]. For a program containing N input variables (e.g.,
parameters, arrays, and I/O variables), any predicate in the
program can be treated as an algebraic relationship and can
thus be described by a surface in the N dimensional input
space. If, as often happens, the predicate is linear, then
the surface is a hyperplane. Consider a two dimensional
example with input variables I and J

    I + 2J <= 3.

The domain strategy tests this predicate using three
test points, two on the line

    I + 2J = 3

and one point which lies off the line, but within an
envelope of width 2d centered on the line (see Figure 7).
Call these points A, B and C. If A, B, and C yield correct
output, we know that the defining curve of the predicate
must cut the sections of the triangle ABC. Choosing d small
enough makes the chance of the predicate actually being one
of these alternatives small. Therefore, even if one doesn't
have complete confidence that the predicate is correct, we
have gained some inductive confidence that the predicate is
correct.

Mutation analysis also deals with the issue of domain
errors. Indeed the domain strategy can be implemented using
mutation once a simple observation is made: it is not
necessary that points A and B both lie on the line -- it is
only necessary that the line separate them or that they do
not both lie on the same side of the line. Hereafter we
will work with the domain strategy using this simplifying
assumption.

There are three error operators which generate mutants
causing the tester to generate the required points.
Intuitively, we can think of mutation analysis as posing
certain alternatives to the predicate in question. These
alternatives require the tester to supply "reasons" (in the
form of test data) why the alternative predicate cannot be
used in place of the original.

Relational Operator Replacement. Changing an
inequality operator to a strict inequality, weakening the
operator, or changing its sense generates a mutant which can
only be eliminated by a test point which exactly satisfies
the predicate. For example changing

    I + 2J <= 3

to

    I + 2J < 3

requires the tester to generate a point on the line

    I + 2J = 3

which satisfies the first predicate but which does not
satisfy the second predicate.
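
A quick enumeration confirms that only points on the line separate the two predicates. The grid of integer test points in this Python sketch is an arbitrary choice made for illustration.

```python
def original(i, j):
    return i + 2 * j <= 3


def mutant(i, j):
    # Relational operator replacement: <= weakened to <.
    return i + 2 * j < 3


def kills(point):
    """True if the test point distinguishes the mutant from the original."""
    return original(*point) != mutant(*point)


# Arbitrary grid of integer test points, chosen only for illustration.
grid = [(i, j) for i in range(-5, 6) for j in range(-5, 6)]
killing = [p for p in grid if kills(p)]

print(kills((1, 1)), kills((0, 0)), kills((5, 5)))  # True False False
print(all(i + 2 * j == 3 for i, j in killing))      # True
```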

Twiddle. Twiddle is a unary operator denoted by ++ or
--, depending on its sense. In the FMS.2 system ++a is
defined to be a+1 if a is an integer and a+.01 if a is
real. In the CMS.1 system, ++a is defined to be sensitive
to the magnitude of a. The complementary operator --a is
defined similarly.

Graphically,  the  effect  of  twiddle  is  to  move  the 
proposed  constraint  a  small  distance  from  the  original  line 
(see  Figure  8).  In  order  to  eliminate  these  mutants,  a  data 
point  must  be  found  which  satisfies  one  constraint  but  not 
the  other  and  is  hence  very  close  to  the  original  line. 
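
The effect of twiddle on the example predicate I + 2J <= 3 can be sketched as follows, using the FMS.2 definition of ++ given above. Applying the operator to I, and the particular test points, are assumptions of the sketch.

```python
def twiddle_up(a):
    """++a as defined for FMS.2: a + 1 for integers, a + .01 for reals."""
    return a + 1 if isinstance(a, int) else a + 0.01


def original(i, j):
    return i + 2 * j <= 3


def mutant(i, j):
    # The same predicate with I replaced by ++I.
    return twiddle_up(i) + 2 * j <= 3


def kills(i, j):
    """True if the point satisfies one constraint but not the other."""
    return original(i, j) != mutant(i, j)


print(kills(1, 1))       # True: on the line I + 2J = 3
print(kills(-5, 1))      # False: far from the line
print(kills(1.0, 0.998)) # True: real point just inside the line
print(kills(1.0, 0.5))   # False: real point too far inside
```

For real-valued inputs the band of killing points is only .01 wide, which is what forces the tester to supply data very close to the original constraint.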
Other Replacements. These operators replace data
references with other syntactically meaningful data
references, and similarly for operators. These effects are
related to the phenomenon of "spoilers" which are described
in 5.8.

The practical effect of considering so many
alternatives is to increase the total number of data points
necessary for their elimination. This leads by the domain
strategy to an increased confidence that the predicate has
been correctly chosen.

For comparison, let us work through the program in
Figure 9, which was used by White, Cohen and Chandrasekaran
[WCC] to illustrate domain strategies. No specifications
are given for this program, but the program can be compared
against a presumably correct version; in any case the
program is useful since it involves only two input
variables.


    READ I, J;
    IF I <= J + 1
       THEN K = I + J - 1
       ELSE K = 2*I + 1;
    IF K > I + 1
       THEN L = I + 1
       ELSE L = J - 1;
    IF I = 5
       THEN M = 2*L + K
       ELSE M = L + 2*K - 1;
    WRITE M;


Figure 9.


The program has only three predicates:

    I <= J + 1,  K > I + 1,  and  I = 5.

The effect of changing the first of these is typical, so we
will deal with it.

Figure 10 is a listing of all the alternatives tried
for the predicate I <= J+1. Some of these are redundant
(e.g., ++I <= J+1 and I <= --J+1), but this is merely an
artifact of the generation device; the redundancies can be
easily removed (see Section 8). The alternative predicates
introduced in this way are illustrated in Figure 11. The
original predicate line is the heavy line. White et al.
hypothesize that the program of Figure 9 contains the
errors:

       statement/expression     should be

    1  K > I + 1                K > I + 2
    2  I = 5                    I = 5 - J
    3  L = J - 1                L = I - 2
    4  K = I + J - 1            THEN IF(2*J < -5*I - 40)
                                     THEN K = 3;
                                     ELSE K = I + J - 1;

We leave it to the reader to verify that attempting to
eliminate the alternative K > I+2 necessarily ends with the
discovery of the first error. Note that this is not trivial
since errors 1 and 4 can interact in a subtle way. In the
sequel we show how the remaining errors are dealt with.

     1.  IF(I < J)
     2.  IF(I < J+2)
     3.  IF(I < J+1)
     4.  IF(I < J+J)
     5.  IF(1 < J+1)
     6.  IF(2 < J+1)
     7.  IF(5 < J+1)
     8.  IF(I < 1+1)
     9.  IF(I < 2+1)
    10.  IF(I < 5+1)
    11.  IF(I < J+5)
    12.  IF(-I < J+1)
    13.  IF(++I < J+1)
    14.  IF(--I < J+1)
    15.  IF(I < -J+1)
    16.  IF(I < ++J+1)
    17.  IF(I < --J+1)
    18.  IF(I < -(J+1))
    19.  IF(I < J-1)
    20.  IF(I < MOD(J,1))
    21.  IF(I < J)
    22.  IF(I < 1)
    23.  IF(I < J+1)
    24.  IF(I = J+1)
    25.  IF(.NOT. I = J+1)
    26.  IF(I > J+1)
    27.  IF(I > J+1)

Figure 10

Figure 11


The introduction of the unary ++ and -- operators can
be generalized in several useful ways. In addition to the
twiddle operators, we consider the unary operator - and the
extra-syntactic operators ABS (absolute value), -ABS
(negative absolute value), and ZPUSH (zero push). Consider
the statement

A = B+C.

In order to eliminate the mutants

A = ABS(B)+C,

A = B+ABS(C),

A = ABS(B+C),

we must generate a set of test points in which B is negative
(so that B+C differs from ABS(B)+C), C is negative, and B+C
is negative. Notice that if it is impossible for B to be
negative then the first of these is an equivalent mutation.
That is, the altered program is equivalent to the original
one. In this case, the proliferation of these alternatives
can either be a nuisance or an important documentation aid,
depending upon the tester's point of view. The topic of
equivalent mutants will be taken up again later.

In similar fashion, negative absolute value insertion
forces the test data to be positive. We use the term domain
pushing for this process. By analogy to the domain
strategy, these mutations push the tester into producing
test cases where the domains satisfy the given requirements.

Zero  Push  is  an  operator  defined  so  that  ZPUSH(x)  is  x 
if  x  is  nonzero  ,  and  otherwise  is  undefined  so  that  the 
mutant  dies  immediately.  Hence  the  elimination  of  this 
mutant  requires  a  test  point  in  which  the  expression  x  has 
the  value  zero. 
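The domain-pushing operators above can be tried out on the
statement A = B+C with a short sketch. It is written in Python
rather than Fortran purely for convenience; the function names
(zpush, neg_abs, killed, live_mutants) and the dictionary of
mutants are illustrative assumptions, not part of any of the
mutation systems described in this report.

```python
# Sketch: which test points eliminate the ABS, -ABS and ZPUSH mutants
# of the statement A = B + C. All names are hypothetical.

class MutantDied(Exception):
    """Raised when ZPUSH is applied to zero (the mutant dies at once)."""


def zpush(x):
    # ZPUSH(x) is x when x is nonzero and undefined otherwise.
    if x == 0:
        raise MutantDied("ZPUSH applied to zero")
    return x


def neg_abs(x):
    # The -ABS operator: negative absolute value.
    return -abs(x)


def original(b, c):
    return b + c


MUTANTS = {
    "A = ABS(B)+C": lambda b, c: abs(b) + c,
    "A = B+ABS(C)": lambda b, c: b + abs(c),
    "A = ABS(B+C)": lambda b, c: abs(b + c),
    "A = -ABS(B)+C": lambda b, c: neg_abs(b) + c,
    "A = ZPUSH(B)+C": lambda b, c: zpush(b) + c,
}


def killed(mutant, b, c):
    """A test point kills a mutant if it dies or its result differs."""
    try:
        return mutant(b, c) != original(b, c)
    except MutantDied:
        return True


def live_mutants(test_points):
    """Names of the mutants that no test point in the list eliminates."""
    return sorted(
        name
        for name, mutant in MUTANTS.items()
        if not any(killed(mutant, b, c) for b, c in test_points)
    )
```

With the single test point (1, 2) only the -ABS mutant is
eliminated; adding a point with B negative (and B+C negative),
a point with C negative, and a point with B zero eliminates the
rest, which is the "push" described in the text.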

Applying  this  process  at  every  point  where  an  absolute 
value  sign  can  be  inserted  gives  a  scattering  effect.  The 
tester  is  forced  to  include  test  cases  acting  in  various 
positions  in  several  problem  domains.  Very  often,  in  the 
presence  of  an  error,  this  scattering  effect  causes  a  test 
case  to  be  generated  in  which  the  error  is  explicit. 

Returning to the example in Figure 9, we can generate
the additional alternatives shown in Figure 12. Figure 13
shows the domains into which these mutants push. Even this
simple example generates a large number of requirements!

One effect of the error L = J-1 is that any test point in
the area bounded by I = J+1 and I = 1 will return an incorrect
result. But this is precisely the area that mutants 8, 9,
and 10 push us into. So, the error could not have gone
undiscovered in mutation analysis.

This  process  of  pushing  the  tester  into  producing  data 
satisfying  some  criterion  is  also  often  accomplished  by 
other  mutations.  Consider  the  program  in  Figure  14,  which 
is based on a text reformatter program by Naur [Nau] and
which  has  been  previously  studied  in  the  program  testing 
literature  [GG]. 
alarm:=FALSE;
bufpos:=0;
fill:=0;
REPEAT
   incharacter(cw);
   IF cw=BL or cw=NL THEN
      IF fill+bufpos  maxpos THEN
         outcharacter(BL);
      ELSE
         BEGIN
            outcharacter(NL);
            fill:=0;
         FOR k:=1 STEP 1 UNTIL bufpos DO outcharact
         fill:=fill+bufpos;
         bufpos:=0
         END
   ELSE
      IF bufpos = maxpos THEN alarm:=TRUE;
      ELSE BEGIN
         bufpos:=bufpos+1;
         buffer[bufpos]:=cw
      END
UNTIL alarm or cw=ET


Figure  14. 
Consider the mutant which replaces the first statement
fill:=0 with the statement fill:=1. The effect of this
mutation is to force a test case to be defined in which the
first word is less than maxpos characters long. This test
case then detects one of the five errors originally reported
in the program [GG]. The surprising thing is that the
effect of this mutation seems to be totally unrelated to the
statement in which the mutation takes place!

5.6 Special Values. Another form of test which has
been introduced by Howden [How2] is special values testing.

Testing of special values is defined in terms of a number of
"rules". For example:

     1. Every subexpression should be
     tested on at least one test case which
     forces the expression to be zero.

     2. Every variable and every
     subexpression should take on a distinct set
     of values in the test case.

The relationship between the first rule and domain pushing
(via zero values mutations) has already been discussed.
The second rule is undeniably important. If two variables
are always given the same value then they are not acting as
free variables and a reference to the first can be uniformly
replaced with a reference to the second. But this is also
an error operator and the existence of these mutations
enforces the goals of Rule 2.

A slightly more general method of enforcing Rule 2
might use the following device. A special array exactly as
large as the number of subexpressions to be computed in the
program is kept. Each entry in this array has two
additional tag bits which are initialized to their low values,
indicating that the array is uninitialized. As each
subexpression is encountered in turn, the value at that point
is recorded in the array and the first tag bit is set.
Subsequently, when the subexpression is again encountered, if
the second tag is still off the current value of the
expression is compared against the recorded value. If these
values differ the second tag is set to its high value;
otherwise no change is made. By counting those expressions in
which the second tag bit is low and the first is high one
can infer which expressions have not had their values
altered over the test case. Mutations could be constructed to
reveal this. This technique is similar to one used in a
compiler system by Hamlet [Ham].
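The tag-bit device just described can be sketched as follows.
This is a minimal illustration in Python, not the mechanism of
any existing system: the class name ValueTracker, its method
names, and the instrumented function run are all assumptions
introduced here.

```python
class ValueTracker:
    """One entry per subexpression: a recorded value plus two tag bits."""

    def __init__(self, n_subexpressions):
        self.recorded = [None] * n_subexpressions  # first value seen
        self.seen = [False] * n_subexpressions     # first tag bit
        self.varied = [False] * n_subexpressions   # second tag bit

    def observe(self, index, value):
        """Call whenever subexpression `index` is evaluated."""
        if not self.seen[index]:
            # First encounter: record the value and set the first tag.
            self.recorded[index] = value
            self.seen[index] = True
        elif not self.varied[index] and value != self.recorded[index]:
            # Later encounter with a different value: set the second tag.
            self.varied[index] = True
        return value

    def constant_subexpressions(self):
        """Entries with first tag high and second tag low."""
        return [
            i
            for i, (seen, varied) in enumerate(zip(self.seen, self.varied))
            if seen and not varied
        ]


def run(tracker, l, k):
    # Hypothetical instrumentation of M = L + 2*K - 1.
    t = tracker.observe(0, 2 * k)
    return tracker.observe(1, l + t - 1)
```

After running the instrumented statement on the test data,
constant_subexpressions() lists the subexpressions whose value
never changed, which are the ones Rule 2 asks the tester to
vary.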

5.7 Coincidental Correctness. The result of evaluating
a given test point is coincidentally correct if the
result matches the intended value in spite of a computation
error. For example, if all our test data results in the
variable I taking on the values 2 and 0, then the
computation

J = I*2

may be coincidentally correct if the intended calculation
was

J = I**2.

The  problem  of  coincidental  correctness  is  really 
central  to  program  testing.  Every  programmer  who  tests  an 
incorrect  program  and  fails  to  find  the  errors  has  really 
encountered  an  instance  of  coincidental  correctness.  In 
spite  of  this,  there  has  been  no  direct  assault  on  the 
problem  and  some  authors  have  gone  so  far  as  to  say  that  the 
problems of coincidental correctness are intractable [WCC].

In mutation analysis, coincidental correctness is
attacked by the use of spoilers. Spoilers implicitly
remove from consideration data points for which the results
could obviously be coincidentally correct -- this "spoils"
those data points. For example, by explicitly creating the
mutation

J = I*2 ==> J = I**2

we spoil those test cases for which I=0 or I=2 are
coincidentally correct and require that at least one test
case have an alternative value.
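The effect of this spoiler can be checked directly. The sketch
below is in Python for convenience, and the function names
(written, intended, coincidentally_correct, spoiler_killed) are
assumptions made for illustration only.

```python
def written(i):
    # The computation as it appears in the program: J = I*2.
    return i * 2


def intended(i):
    # The spoiler (and, here, the intended calculation): J = I**2.
    return i ** 2


def coincidentally_correct(points):
    """Test values of I on which the two computations agree."""
    return [i for i in points if written(i) == intended(i)]


def spoiler_killed(points):
    """True if some test value distinguishes J = I*2 from J = I**2."""
    return any(written(i) != intended(i) for i in points)
```

Test data in which I takes only the values 0 and 2 leaves the
spoiler alive; any other value of I eliminates it.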

Continuing with the example of Figure 9, Figures 15 and
16 show the spoilers and their effects associated with the
statement M=L+2*K-1. Notice that a single spoiler may be
associated with up to four different lines depending on the
outcome of the first two predicates in the program. In
geometric terms, the effect of the spoilers is that within
each data domain for each line there must be at least one
test case which does not lie on the given line. In broad
terms, the effect is to require a large number of data
points for which the possibility of coincidental
correctness is very slight.
 1.  M=(L+1*K)-1
 2.  M=(L+3*K)-1
 3.  M=(I+2*K)-1
 4.  M=(J+2*K)-1
 5.  M=(K+2*K)-1
 6.  M=(L+2*J)-1
 7.  M=(L+2*I)-1
 8.  M=(L+2*L)-1
 9.  M=(L+I*K)-1
10.  M=(L+J*K)-1
11.  M=(L+K*K)-1
12.  M=(L+L*K)-1
13.  M=(L+2*K)-I
14.  M=(L+2*K)-J
15.  M=(L+2*K)-K
16.  M=(L+2*K)-L
17.  M=(1+2*K)-1
18.  M=(2+2*K)-1
19.  M=(5+2*K)-1
20.  M=(L+2*1)-1
21.  M=(L+2*2)-1
22.  M=(L+2*5)-1
23.  M=(L+5*K)-1
24.  M=(-L+2*K)-1
25.  M=(L+-2*K)-1
26.  M=(L+2*-K)-1
27.  M=(L+2*--K)-1
28.  M=-(L+2*K)-1
29.  M=-((L+2*K)-1)
30.  M=(L+2+K)-1
31.  M=(L+2-K)-1
32.  M=(L+MOD(2,K))-1
33.  M=(L+2/K)-1
34.  M=(L+2**K)-1
35.  M=(L+2)-1
36.  M=(L+K)-1
37.  M=L-2*K-1
38.  M=(MOD(L,2*K))-1
39.  M=L/2*K-1
40.  M=L*2*K-1
41.  M=L**(2*K)-1
42.  M=L-1
43.  M=(2*K)-1
44.  M=L+2*K+1
45.  M=MOD(L+2*K,1)
46.  M=(L+2*K)/1
47.  M=(L+2*K)*1
48.  M=(L+2*K)**1
49.  M=(L+2*K)
50.  M=1


Figure  15. 
Often the fact that two expressions are coincidentally
the same over the input data is a sign of a program error or
of poor testing. The sorting program of Figure 17 is from
[Wir], and it performs correctly for a large number of input
values. If, however, the statements following the IF
statement are never executed for some loop iteration it is
possible for R3 to be incorrectly set and an incorrectly
sorted array will result.

By constructing the mutant which replaces the statement

a(R1):=R0 ==> a(R1):=a(R3)

it is clear that there are two ways of defining R0, only one
of which is used in the test data. This exposes the error.




FOR R1=0 BY 1 TO N BEGIN
   R0:=a(R1);
   FOR R2=R1+1 BY 1 TO N BEGIN
      IF a(R2)>R0 THEN BEGIN
         R0:=a(R2);
         R3:=R2
      END
   END
   R2:=R0;
   a(R1):=R0;
   a(R3):=R2
END;


Figure  17 
5.8 Missing Path Errors. A program contains a missing
path error if a predicate is required which does not appear
in the subject program, causing some data to be computed by
the same function when an altogether different function of
the input data is called for. The definition is due to
Howden [How2]. Such missing predicates can really be the
result of two different problems, however, so we might
consider the following alternative definitions.

A program contains a specificational missing path error
if two cases which are treated differently in the
specifications are incorrectly combined into a single
function in the program. On the other hand, a program contains
a computational missing path error if within the domain of a
single specification a path is missing which is required
only because of the nature of the algorithm or of the data
involved.

An example of a specificational error is the fourth
error from the example in Section 5.5. Although this error
might result from a specification, there is nothing in the
code itself which could give any hint that the data in the
range

2*J < -5*I-40

is to be handled any differently than shown in the program.

As  an  example  of  the  second  class  of  path  error 
consider  the  subroutine  shown  in  Figure  18,  which  is  adapted 
from  [KP].  The  input  consists  of  a  sorted  table  of  numbers 
and  an  element  which  may  or  may  not  be  in  the  table.  The 
only  specification  is  that  upon  return 

X(LOW) =< A =< X(HIGH)

and

HIGH =< LOW+1.

A  problem  arises  if  the  program  is  presented  with  a  table  of 
only  one  entry,  in  which  case  the  program  diverges. 

In  the  specifications  there  is  no  clue  that  a  one-entry 
table  is  to  be  treated  any  differently  from  a  k>1  entry 
table.  The  algorithm  makes  it  a  special  case. 




      SUBROUTINE BIN(X,N,A,LOW,HIGH)
      INTEGER X(N),N,A,LOW,HIGH
      INTEGER MID
      LOW=1
      HIGH=N
    6 IF(HIGH-LOW-1)7,12,7
   12 RETURN
    7 MID=(LOW+HIGH)/2
      IF(A-X(MID))9,10,10
    9 HIGH=MID
      GO TO 6
   10 LOW=MID
      GO TO 6
      END


Figure  18. 




Computational missing path problems are usually caused
by requirements to treat certain values (e.g., negative
numbers) differently from others. When this occurs, data
pushing and spoiling often lead to the detection of the errors.
In the example under consideration here an attempt to kill
either of the mutants

IF(HIGH-LOW-1)12,12,7

or

MID=(LOW+HIGH)-2

will cause us to generate a test case with a single element.
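The divergence on a one-entry table, and the way the first
mutant exposes it, can be seen in a small simulation. The
sketch below transliterates the control flow of Figure 18 into
Python under stated assumptions: the step limit max_steps and
the Diverged exception are devices introduced here to make
non-termination observable, and the flag mutant selects the
arithmetic IF with branch targets 12,12,7.

```python
class Diverged(Exception):
    """Raised when the step limit is exhausted (stands in for looping)."""


def bin_search(x, a, mutant=False, max_steps=1000):
    """Return (LOW, HIGH) as 1-based indices into the sorted table x."""
    low, high = 1, len(x)
    for _ in range(max_steps):
        d = high - low - 1
        # Original: IF(HIGH-LOW-1)7,12,7   returns only when d == 0.
        # Mutant:   IF(HIGH-LOW-1)12,12,7  returns when d <= 0.
        if d == 0 or (mutant and d < 0):
            return low, high
        mid = (low + high) // 2
        # IF(A-X(MID))9,10,10: negative goes to HIGH=MID, else LOW=MID.
        if a - x[mid - 1] < 0:
            high = mid
        else:
            low = mid
    raise Diverged("no RETURN reached within the step limit")
```

On tables of two or more entries the original and the mutant
return the same result, so only a single-element table
distinguishes them; on that table the original never reaches
RETURN.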

Since mutation analysis -- like all testing techniques --
deals mainly with the program under test, the problem of
dealing with specificational missing path errors appears to
be considerably more difficult. Under the Competent
Programmer Assumption and the Coupling Effect, however, a
tester who has access to an "oracle" for the program
specifications can assume that the mutants cover all program
behavior! So by consulting the specifications the tester
can uncover missing paths by noting incomplete behavior.
But since the assumptions of a competent programmer and
coupling are statistical and since it may be infeasible to
check for incomplete behavior, detection of such missing
paths is not certain.

To see this failure, consider the missing path error
from Section 5.5. It is possible to generate test data
which is adequate but which fails to detect the missing path
error because there is no oracle to consult for completeness
of behavior. This appears to be a fundamental limitation of
the testing process. Unlike, say, program verification,
program testing does not require uniform a priori
specifications; rather we only ask that the tester be able
to judge correctness on a case-by-case basis. It is our
view that the only way to attack these problems is to start
with a core of test cases generated from specifications,
independent of the subject program. This core of test cases
can then be augmented to achieve stronger goals. We note
that some preliminary work on generating test data from
specifications has already been reported [GG,OW].

5.9  Missing  Statement  Errors.  By  analogy  with  missing 
path  errors,  a  missing  statement  error  is  defined  by  a 
statement  which  should  appear  in  the  program  but  which  does 
not.  It  is  not  clear  that  the  techniques  of  statement 
analysis  can  be  used  to  uncover  these  errors.  In  fact,  it 
is  rather  surprising  that  mutation  analysis  --  a  technique 
which  is  directly  oriented  toward  examining  the  effect  of  a 
modification  to  a  statement  --  can  be  used  to  detect  missing 
statements  at  all! 

To see how this can be accomplished, consider the
program shown in Figure 19. This program accepts a vector V
of length N and returns in MPSUM the value

V(i)+V(i+1)+...+V(N)

where j=i-1 is the smallest index such that V(j) is strictly
positive. In degenerate cases, MPSUM=0 is returned.

There is a missing RETURN statement which should follow
the IF statement. The effect of the error is to cause
undefined behavior when the vector V is uniformly nonpositive
(undefined, since DO loop variables are of indeterminate
value after normal completion of the loop).

A simple mutation of MPADD is the transformation

DO 1 I=1,N ==> DO 1 I=1,N+1.

This mutant fails only when the loop executes N+1 times. In
this case all elements of V are nonpositive and the original
program fails, so eliminating this mutant uncovers the
error. But even after adding the return statement, MPADD
will still be incorrect due to a missing path error. We
leave it to the reader to discover the error by considering
the mutant

DO 1 I=1,N ==> DO 1 I=1,N-1.
      SUBROUTINE MPADD(V,N,MPSUM)
      INTEGER V(N),N,MPSUM
      MPSUM=0
      DO 1 I=1,N
    1 IF(V(I).GT.0)GO TO 2
    2 M=I+1
      DO 3 I=M,N
    3 MPSUM=MPSUM+V(I)
      RETURN
      END


Figure 19.
6.  A  CASE  STUDY 

To see the effect of mutation analysis on a tester who
is attempting to locate and remove program errors, it is
worthwhile to examine a debugging session for a program that
is not known beforehand to be "testable". This case study
differs from previous mutation dialogs which we have
reported [DLS1,DLS2,LS] in that our previous reports dealt
with programs strongly believed to be correct, for which
mutation analysis was used as a tool to increase our
confidence in the program's correctness. The subject program
to be discussed here is known to contain at least one
"resistant" error; the error had resisted all of the usual
debugging techniques such as selective traces and statement
instrumentation. Hence, mutation analysis is used here not
as a test data evaluator but as a tool for systematic
debugging and, perhaps just as importantly, as a convenient
run time environment for Fortran subroutines.

The  subject  program  is  a  routine  called  NXTLIV.  It  is 
a  key  routine  in  the  CMS.1  system  and  can  be  considered  a 
production  program  for  purposes  of  testing.  NXTLIV  accepts 
as  input  the  identifying  number  of  a  mutant  of  a  given  type 
and  returns  the  number  of  the  next  live  mutant,  as  indicated 
by  bit  maps  of  the  live  mutants.  The  bit  maps  are  in 
general  too  large  to  fit  in  an  internal  array  so  they  must 
be  paged  from  a  random  access  disk  file  as  needed.  Similar 
maps  of  the  dead  mutants  and  equivalent  mutants  are  also 
stored.  The  subject  program  is  shown  in  Figure  20. 
      SUBROUTINE NXTLIV(MTYPE,MUTNO)
C     FIND THE NEXT LIVE MUTANT AFTER THE MUTNOth OF TYPE MTYPE
C     RETURN THIS VALUE IN MUTNO.
C     A VALUE OF ZERO RETURNED MEANS NO MUTANTS OF THAT TYPE
C     REMAIN ALIVE.
      NOLIST
$INSERT ICS057>CPMS.COMPAR>SYSTEM.PAR
$INSERT ICS057>CPMS.COMPAR>MACHINE.SIZES.PAR
$INSERT ICS057>CPMS.COMPAR>FILENM.COM
$INSERT ICS057>CPMS.COMPAR>TSTDAT.COM
$INSERT ICS057>CPMS.COMPAR>MSBUF.COM
      LIST
      INTEGER MTYPE,MUTNO
      INTEGER I,J,K,L,WORD,BIT
      LOGICAL ERR
C     CALL TIMER1(33)
C     ASSUME THAT THE RECORD CONTAINING THE LIVE BIT MAPS FOR
C     MUTNO IS ALREADY PRESENT, UNLESS MUTNO=0.
      K=BPW-1
C     CHECK TO SEE IF WE ARE AT THE END OF A PHYSICAL RECORD
      IF(MUTNO.EQ.0)GO TO 1
      IF(MOD(MUTNO,K*MSFRS).EQ.0)GO TO 24
      GO TO 10
    1 CALL REARAN(MSFILE,LIVBUF,MSFRS,LIVPTR,ERR)
      IF(ERR)CALL ABORT('(NXTLIV) ERROR IN MUTANT STATUS FILE',36)
      CALL REARAN(MSFILE,EQUBUF,MSFRS,EQUPTR,ERR)
      IF(ERR)CALL ABORT('(NXTLIV) ERROR IN MUTANT STATUS FILE',36)
      CALL REARAN(MSFILE,DEDBUF,MSFRS,DEDPTR,ERR)
      IF(ERR)CALL ABORT('(NXTLIV) ERROR IN MUTANT STATUS FILE',36)
      CHANGD=.FALSE.
      WORD=1
      BIT=2
      GO TO 20
   10 WORD=MOD((MUTNO)/(K),MSFRS)+1
      BIT=MOD(MUTNO,K)+2
   20 DO 22 J=WORD,MSFRS
      L=LIVBUF(J)
      IF(L.NE.0)GO TO 23
      MUTNO=MUTNO+K
      IF(MUTNO.GT.MCT)GO TO 40
      GO TO 22
   23 DO 21 I=BIT,BPW
      MUTNO=MUTNO+1
      IF(MUTNO.GT.MCT)GO TO 40
      IF(AND(L,2**(BPW-I)).NE.0)GO TO 30
   21 CONTINUE
      BIT=2
   22 CONTINUE
   24 IF(.NOT.CHANGD)GO TO 25
C     SAVE OLD RECORDS
      CALL WRTRAN(MSFILE,LIVBUF,MSFRS,LIVPTR,ERR)
      CALL WRTRAN(MSFILE,EQUBUF,MSFRS,EQUPTR,ERR)
      CALL WRTRAN(MSFILE,DEDBUF,MSFRS,DEDPTR,ERR)
C     NEED TO GET NEXT RECORDS
   25 LIVPTR=LIVPTR+MSFRS
      EQUPTR=EQUPTR+MSFRS
      DEDPTR=DEDPTR+MSFRS
      GO TO 1
   30 GO TO 9999
   40 MUTNO=0
      IF(.NOT.CHANGD)GO TO 9999
C     SAVE OLD RECORDS
      CALL WRTRAN(MSFILE,LIVBUF,MSFRS,LIVPTR,ERR)
      CALL WRTRAN(MSFILE,EQUBUF,MSFRS,EQUPTR,ERR)
      CALL WRTRAN(MSFILE,DEDBUF,MSFRS,DEDPTR,ERR)
 9999 CONTINUE
C     CALL TIMER2
      RETURN
      END


Figure  20. 
Since FMS.1 provides a more user-oriented environment
than FMS.2, NXTLIV was tested using FMS.1. To adapt to the
smaller Fortran subset of FMS.1, some modifications had to
be made. Since FMS.1 does not accept PARAMETER statements,
the parameters BPW and MSFRS (from the $INSERT blocks) were
replaced with typical values. Allowances had to be made for
the unsupported CALL and the random I/O routines. The two
TIMER calls were ignored. Integer arithmetic was used to
simulate the remaining features. To facilitate testing,
several parameters are entered as explicit formal
parameters.

FMS.1 first asks for the parameter values:

MUTNO = 0
MCT = 6    (MCT is the total number of mutants of current type)
CHANGD = 0
LIVBUF(1)=LIVBUF(2)=7
LIVBUF(3)=LIVBUF(4)=0
NLB(1)=...=NLB(4)=0  (NLB is the next live buffer. It should be
                      transferred to LIVBUF for use immediately)
LLB(1)=...=LLB(4)=0  (LLB is the last live buffer)

Once  the  data  is  entered  the  system  executes  NXTLIV  on 
the  test  points  and  responds: 

PARAMETERS ON OUTPUT
MUTNO=0
LIVBUF(1)=0
LIVBUF(2)=0
LIVBUF(3)=0
LIVBUF(4)=0
LLB(1)=0
LLB(2)=0
LLB(3)=0
LLB(4)=0
CHANGD=0

THE RAW PROGRAM TOOK 41 STEPS TO EXECUTE THIS TEST CASE


The  output  MUTNO=0  signifies  that  the  end  of  the  live 
mutant  map  for  this  type  has  been  reached.  The  tester  then 
informs  the  system  that  NXTLIV  has  worked  correctly  for  this 
test  case.  The  first  type  of  mutant  to  be  investigated  by 
the  tester  is  SAN  (Statement  Analysis),  which  replaces 
statements  by  traps.  The  FMS.1  mutation  report  for  this  run 
is  as  shown  below. 

POST  RUN  PHASE 

NUMBER  OF  TEST  CASES  =  1  NUMBER  OF  MUTANTS  =  44 

NUMBER  OF  LIVE  MUTANTS:  23  PCT.  ELIMINATED  MUTANTS  =  47.73 

Examination  shows  the  mutants  shown  in  Figure  21(a)  to 
be  still  live. 

In  attempting  to  kill  these  mutants  the  tester 
generates  the  testcases  2  and  3  (see  Figure  21(b)). 
line  statement                                 has been changed to

16    IF((MUTNO/12)*12.EQ.MUTNO)GO TO 24        TRAP
17    GO TO 10                                  TRAP
32    WORD=((MUTNO/3)-4*((MUTNO/3)/4))+1        TRAP
34    BIT=MUTNO-3*(MUTNO/3)+2                   TRAP

Figure 21(a)
Testcase 2 eliminates twelve of the remaining SAN
mutants. Testcase 3, on the other hand, produces the output

PARAMETERS ON OUTPUT
MUTNO=14
LIVBUF(1)=7
LIVBUF(2)=7
LIVBUF(3)=7
LIVBUF(4)=0
LLB(1)=1
LLB(2)=3
LLB(4)=0
LLB(5)=0

THE RAW PROGRAM TOOK 56 STEPS TO EXECUTE THIS TEST CASE.

An error has been detected; the correct output for
MUTNO is 13 instead of 14. This error resulted from choosing a
starting point in the middle of a word of zero bits. NXTLIV
ordinarily searches the bits of each word looking for the
next "1", but for efficiency a whole word is compared to
zero before the search is begun. If all bits are set low,
MUTNO is incremented by the word length and the next word is
accessed. A correct algorithm would increment MUTNO only by
the number of bits left to be examined in the word. The
only way this can make a difference in the original program
is for NXTLIV to be called in such a way as to stop at a "1"
bit in the middle of the word, which is otherwise all 0's,
and then by a mutant failure or equivalence (outside the
routine) to have that bit turned off before NXTLIV is called
again for the next mutant to be considered. Obviously this
situation is so rare that it is bound to defy haphazard
debugging attempts but is nonetheless common enough to
cause irritation in a production-sized Cobol run.

The needed fix is to replace

MUTNO=MUTNO+K

by

MUTNO=MUTNO+(K-(BIT-2)).

After eliminating all SAN mutants and turning on the
remaining error operators, a total of eleven test cases
killed all but 50 of 1,514 mutants, about 96.7 percent of the
total. Eventually the tester's attention is directed to the
mutant at line 45

BIT=2 ==> I=2.

The testcase 15 in Figure 21(b) is an attempt to eliminate
this mutant. The program again fails and another error has
been found. This error is also related to the test for the
entire word of zeroes. By starting in the middle of a word
of zeroes, the BIT pointer is not correctly set to 2 to
begin searching the next word. The correction is to replace

      BIT=2
   22 CONTINUE

by

   22 BIT=2

An interesting note is that this "correction" is
actually a mutation that the tester would have had to
eliminate in any event, so in effect the error was uncovered
by the coupling effect before it was explicitly considered.
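The two errors in the zero-word shortcut can be reproduced in a
much simplified model of the search. The Python sketch below is
an assumption-laden illustration, not a translation of NXTLIV:
it ignores paging, MCT, and the unused bit of each word, numbers
bits from 0 instead of 2, and uses the hypothetical flags
fix_increment and fix_bit_reset to switch each correction on or
off.

```python
def next_live(words, mutno, fix_increment=True, fix_bit_reset=True):
    """Number of the next live mutant after `mutno`, or 0 if none.

    `words` is a list of equal-length bit lists; mutant m (1-based)
    is stored at words[(m - 1) // k][(m - 1) % k].
    """
    k = len(words[0])
    word, bit = divmod(mutno, k)
    for w in range(word, len(words)):
        if not any(words[w]):
            # Whole word is zero: skip it without scanning its bits.
            # Original code adds the full word length K; the first fix
            # adds only the number of bits left in the word.
            mutno += (k - bit) if fix_increment else k
            if fix_bit_reset:
                bit = 0  # second fix: restart at the first bit
            continue
        for b in range(bit, k):
            mutno += 1
            if words[w][b]:
                return mutno
        bit = 0
    return 0
```

Starting in the middle of a zero word, the uncorrected model
overshoots by one when the following word is all ones. With
only the first correction applied the same test happens to give
the right answer, and a following word with a single leading
"1" is needed to expose the missing reset of the bit pointer.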

In completing the analysis of NXTLIV the tester of
course has to deal with the equivalent mutants. This
subject will be discussed in more detail in a later section.
The complete analysis of the corrected program required the
elimination of 1,580 mutants. The corrected algorithm has
since been running without known failure in CMS.1.


7.  SEEDING  AND  FAULTS 

There are two previously suggested error detection
techniques which seem to bear strong resemblance to mutation
analysis. They arise in different settings and the
relationship of mutation analysis to both of them has been
questioned in several private correspondences. One of these
is the error seeding technique described with several
applications by Gilb [Gil] and the other is fault detection
[Cha] applied to circuit design. Mutation analysis has
almost nothing in common with error seeding, but owes a great
deal to fault detection work in switching theory.

The idea behind error seeding is to insert "random"
errors in a program. This approach has been used in several
studies of the programming and debugging process. In one
experiment the seeds were used to calibrate the
effectiveness of software documentation on its maintainability;
in another experiment the number of errors in a program is
estimated by inserting the seeds and then uncovering k
errors, using the percentage of those k errors which were
seeded to infer the total number of errors.

On the surface this idea seems very similar to
mutation. Let us look a little more closely at the notion
of "randomness" which is so crucial to the technique.
First, if we inspect the results of the experiments
described in [Gil], we are struck by the lack of resolution.
In the first experiment described above, for example,
"randomly" chosen groups of programmers were given various sets
of clues about the programs to be debugged. As reported by
Gilb: "Variations between individuals in homogeneously
selected groups of programmers are at least 2 to 1 and up to
10 to 1." Furthermore, the interpretations consistent with
the experimental results tend to be highly suspect: "The
use of test data seems to be less effective than simple
source program reading."

The reason for such results is apparent in the following
description given by Gilb of the statistical basis for
using seeding to estimate the total number of errors in a
program.

     How many fish are there in a pond or a
     lake? Let's say that a reasonably large
     sample of 1000 fish are marked and then
     allowed to mix for a while with the
     total population in the pond. If we
     then take a new sample of 1000 fish and
     find that 50 of these have our markings
     on them, this gives us 20,000 fish as a
     reasonable estimate if we accept the
     original sample as random and the
     remixing of the fish as homogeneous.

This  seems  to  be  the  source  of  the  difficulty.  We  have 
strong  evidence  that,  first,  the  fish  tend  to  school  in  ways 
that  are  not  predictable.  So  in  order  to  get  a  truly  random 
sample  we  have  to  know  where  to  fish  beforehand,  and  second, 
the  marked  fish  show  truly  idiosyncratic  tastes  in  picking 
their  associations  in  the  pond.  In  particular,  there  seems 
to  be  no  way  at  all  of  insuring  that  the  sample  we  obtain 
neither  underestimates  nor  overestimates  the  original 
population  by  unpredictable  amounts.  In  less  prosaic  terms 
the  preponderance  of  evidence  obtained  through  mutation 
analysis (see [DLS2,LS] for indicative studies) is that
errors do not occur with statistical properties that make
them useful for error seeding studies. Even though they may
be considered the result of a stochastic process whose
properties can be determined for small well-defined
aggregates, they are in individual programs sporadic, highly
non-independent, and not uniformly distributed through the
code. It is precisely because the inserted errors are
random that they do not relate in a regular way to the natural
errors. As we have seen, it takes much care in the choice
of  error  operators  to  insure  that  specific  categories  of 
errors  are  reliably  detectable  by  mutation  analysis. 

A  hallmark  of  mutation  analysis  is  that  it  rests  on  the 
Competent  Programmer  Assumption;  we  explicitly  assume  that  a 
program  is  not  a  random  object.  A  program  once  it  is 
created  contains  errors  and  these  are  fixed, 
deterministically  located  objects.  In  order  for  a 
statistical  technique  to  be  applicable  to  a  given  program  a 
considerable  number  of  a  priori  assumptions  must  be  rather 
fully justified. It is, however, possible to design
experiments on fixed populations of programs, whose properties
are  quantifiable,  which  will  reveal  statistical  properties 
of  such  hypotheses  as  the  Coupling  Effect.  But  this  is  an 
entirely  different  issue. 

To  clearly  draw  the  distinction  it  may  be  helpful  not 
to  think  of  the  mutants  as  being  errors,  but  simply  as  small 
perturbations  of  the  program's  structure.  As  we  have  seen, 
these perturbations have the effect of insuring that the
test data exercises the program in a thorough fashion. If
the test data is sensitive to the perturbations, then one's
confidence that what was written was what was intended is
correspondingly increased. If, on the other hand, the test
data allows one to alter the program significantly without
changing its apparent behavior, then one has little
confidence in the test.

Finally,  mutation  analysis  has  a  psycho-social  aspect 
that  error  seeding  cannot  have.  Even  if  error  seeding 
worked  perfectly,  the  assumptions  which  make  it  work  would 
also insure that it gives no information about where the
remaining natural errors occur (statistical independence
insures this). Mutation analysis forces a controlled
reconsideration of the source code. It leads -- as we saw
in the section preceding this one -- to a situation in
which the tester must consider statement x and ask himself
"why does it not matter if statement x is changed to x'?"
The possible answers are that statement x is in error, that
it does matter but the test data does not reveal it, that x
is equivalent in context to x', or that the programmer does
not understand statement x and is unable to give a reason.
In each situation information about the program, about the
test, and about the programmer is revealed.

Fault detection experimentation is a classical technique
for detecting faults in switching circuits. The
crucial idea is that one systematically "faults" circuit
elements and examines the input-output function of the
resulting circuit by comparing it to the original circuit
[Cha]. This is the key idea of mutation analysis. There
are, however, some essential differences which make mutation
analysis applicable on a larger scale. First, the principal
use of fault detection is to check circuit deterioration,
not to validate design. Second, because circuits tend not
to be functionally organized, the technique is exhaustive
when applied to design testing (for deterioration
experiments there is frequently fault data available to guide
the experimenter). In essence, the approach adopted by
mutation analysis is fault detection applied to systems of
high functionality in the presence of the Competent
Programmer Hypothesis and the Coupling Effect. This
suggests that perhaps mutation analysis in its automated form
can be used for circuit validation. Perhaps, although the
lack of functional description at the switching element
level makes it hard to avoid the exhaustive and therefore
combinatorially explosive growth of the test cases. But
technology has grown in an unexpected direction in the last
twenty years, and the digital design techniques of today
seem to be not ill-suited to mutation analysis. In
preliminary hand studies to be reported elsewhere, we have
used the mutation analysis approach to test micro-coded
circuit designs with surprising success.


8. THE PROBLEM OF MUTANT EQUIVALENCE

Experience indicates that in production programs the
number of equivalent mutants can vary between 2% and 5% of
the total mutant count. In more finely tuned programs (see,
e.g., our analysis of FIND in [DLS1] and Burns' analysis of
sorting routines [Bur]), however, it is common for source
statements to appear in a particular form solely for
efficiency reasons. In these programs such statements can be
altered without affecting the output behavior. A typical
example of this behavior is beginning a loop at 2 instead of
1 or 0, so that a mutation which changes

     2 ==> 1

for example, causes an extra iteration but does not alter
the outcome of the looping operation. In tuned programs,
the equivalent mutants can comprise as much as 10% of the
total.
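
A small sketch may make the loop-bound case concrete. The
routine below is a hypothetical example of ours, not one of the
programs analyzed in this report, and it is written in Python
rather than Fortran. Because Python lists are indexed from 0,
the report's 2 ==> 1 corresponds here to changing the starting
index from 1 to 0.

```python
def largest(values, start=1):
    """Return the largest element of a non-empty list.

    The scan normally starts at index 1 because values[0] is already
    the running maximum. The `start` parameter lets us apply the
    mutation that lowers the loop bound by one.
    """
    best = values[0]
    for i in range(start, len(values)):
        if values[i] > best:
            best = values[i]
    return best


data = [[3], [1, 2], [5, 4, 9, 9], [-7, -2, -9]]
# The mutant makes one extra, useless comparison of values[0] with
# itself but returns the same result on every input.
assert all(largest(v, start=1) == largest(v, start=0) for v in data)
```

No test data can kill this mutant; the original bound exists
only to save one comparison.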

It is easy to show that equivalent mutant detection is
a formally undecidable problem (note that equivalent mutant
detection is not obviously the same problem as the general
equivalence problem for program schemata [Man]). Assume a
fixed programming language which is expressive enough to
allow the programming of all recursive functions, and let P1
and P2 be arbitrary procedures written in the language.
Since "goto" mutations are meaningful and likely mutations,
consider the following program to which goto replacement has
been applied.

     goto L;                  goto M;
     L: P1; halt;     ==>     L: P1; halt;
     M: P2; halt;             M: P2; halt;

Clearly, these two programs are equivalent (that is, they
either halt together and deliver the same output or they
diverge together) if and only if P1 and P2 are equivalent,
and that is undecidable for the language described above.
In fact, our choice of language is needlessly complex;
essentially the same proof holds for the Fortran subset
accepted by FMS.1 and the Cobol subset accepted by CMS.1.

In  spite  of  this,  most  equivalent  mutants  are  stylized 
and  rather  easy  to  judge  equivalent.  This  is  perhaps  due  to 
the  Competent  Programmer  Assumption:  the  subject  program 
and  an  allegedly  equivalent  mutant  are  not  chosen  randomly 
--  in  fact,  they  are  chosen  by  a  very  careful  sieving  of  all 
possible  programs  and  the  structure  of  this  relationship 
should  be  something  that  one  can  exploit  in  determining 
mutant  equivalence. 

Before we proceed it may be instructive to examine a
few instances of equivalent mutants which show this
structure. In the analysis of SCAN (see Section 2), a relatively
large number of mutants resulting from the transformation

     X ==> RETURN

appear as live mutants on even very good test data. On
closer examination, however, most of these reveal that X is

     GO TO 90,

where the statement labelled 90 is itself a RETURN. The
programmer's style is to always jump to a common RETURN
statement, allowing an easy "proof" of equivalence.

For a more pregnant example, let us return to the
NXTLIV routine described above. A principal source of
equivalent mutants in that example was the troublesome test
for a word of zeroes. Its only purpose is to save the
effort of looking through the words bit by bit. If the
condition in the test is replaced by any identically true
expression, the program runs a bit longer but is otherwise
identical (see Figure 22(a)). Similarly, the mutation shown
in Figure 22(b) changes the performance of the program
only, but this time it improves it!

     IF(L.NE.0) GOTO 23   ==>   IF(12.NE.0) GOTO 23
     (applied at line 34)

                    Figure 22(a)

     IF(MUTNO.GT.MCT) GOTO 40   ==>   IF(MUTNO.GE.MCT) GOTO 40
     (applied at line 36)

                    Figure 22(b)

                     Figure 22.

These last two examples are not accidental. Mutations
of a program are remarkably similar to simple
transformations that are made in code optimization; it is not
surprising that some of them should turn out to be
optimizing or de-optimizing transformations. Conversely,
correctness preserving optimizing transformations should be
applicable to detecting equivalent mutants. If this is a
useful heuristic then the task of identifying equivalent
mutants can be reduced to detecting those which are
equivalent for an interesting reason.

Almost all of the techniques used in optimizing
compiled code can be applied in some way to decide whether a
mutant is equivalent to the subject program. Some
optimizing transformations are widely applicable while others are
severely limited in scope. We will give a sampling of the
useful transformations. For terminology and detailed
discussions see [AU,Sch].

8.1 Constant Propagation. Constant propagation
involves replacing expressions whose operands are known
constants by their values, to eliminate run-time evaluation. A
typical optimizing transformation would replace statement 3
as shown below

     1   A = 1                 1   A = 1
     2   B = 2        ==>      2   B = 2
     3   C = A+B               3   C = 3

There are several elegant schemes for global transformations
of this form.

Constant  propagation  is  most  useful  for  detecting  cases 
in  which  a  mutant  is  not  equivalent  to  the  subject  program; 
any  change  which  can  affect  the  known  value  of  a  variable 
can  be  detected  in  this  fashion.  The  mechanism  for  testing 
equivalence of mutants using constant propagation is to
compare at all points after the mutation site the constants
which  are  globally  propagated  through  the  program.  If  they 
differ  it  is  likely  that  the  programs  are  not  equivalent. 
The  test  is  certain  if  there  is  a  RETURN,  HALT  or  some  other 
exit  statement  in  which  the  set  of  associated  constants 
contains  an  output  variable  and  if  there  is  a  path  from  the 
entry  point  of  the  program  to  the  exit  point.  This  is 
resolvable  by  dead  code  detection  (see  8.6). 
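
The comparison of propagated constants can be sketched for the
simplest case, straight-line code with no branches. This is an
illustration under our own assumptions, not the algorithm used
by any system described here: the program representation, the
function names propagate and differs_at_exit, and the operator
mutation A+B ==> A-B are all hypothetical, and Python stands in
for Fortran.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def propagate(program):
    """Constant propagation for straight-line assignments.

    `program` is a list of (target, expr) pairs, where expr is an int,
    a variable name, or a tuple (op, left, right). Returns the mapping
    from variables to the constants known at the exit.
    """
    known = {}

    def value(e):
        if isinstance(e, int):
            return e
        if isinstance(e, str):
            return known.get(e)          # None means "not a known constant"
        op, left, right = e
        a, b = value(left), value(right)
        return None if a is None or b is None else OPS[op](a, b)

    for target, expr in program:
        v = value(expr)
        if v is None:
            known.pop(target, None)      # target is no longer a known constant
        else:
            known[target] = v
    return known


def differs_at_exit(p, q, outputs):
    """True if some output variable has different known constants at exit."""
    kp, kq = propagate(p), propagate(q)
    return any(v in kp and v in kq and kp[v] != kq[v] for v in outputs)


subject = [("A", 1), ("B", 2), ("C", ("+", "A", "B"))]
mutant = [("A", 1), ("B", 2), ("C", ("-", "A", "B"))]   # hypothetical mutation

print(propagate(subject))                       # {'A': 1, 'B': 2, 'C': 3}
print(differs_at_exit(subject, mutant, ["C"]))  # True
```

With C taken as an output variable, the differing constants at
the exit show the mutant to be non-equivalent without running
any test case. A full treatment would also have to handle
branches and joins, which this sketch omits.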

8.2  Invariant  Propagation.  Invariant  propagation 
generalizes  constant  propagation  by  associating  with  each 
statement  a  set  of  invariant  relations  between  data  elements 
(e.g., X<0 or B=1). Although invariant propagation has met
with limited applicability in compiler design, it is a
powerful technique for detecting equivalent mutants,
particularly those involving relational mutant operators.
These operators frequently only affect an expression if it
has a certain relationship to 0. For example, |x| changes
the value of x only if x<0. In the program-mutant pair

     IF(A.LT.0) GOTO 1              IF(A.LT.0) GOTO 1
     B = A                 ==>      B = ABS(A)

the conditional allows us to determine the invariant (A>=0)
and this allows us to determine that the program and its
mutant are equivalent since the absolute value of a
nonnegative number is that number.

Invariant  propagation  is  enhanced  if  the  propagation  and 
testing  algorithms  exploit  transitivity  of  the  relations  and 
allow  the  replacement  of  an  invariant  by  a  weaker  one. 
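
One simple way to carry such invariants is as numeric intervals.
The sketch below is our own illustration, in Python, of the
ABS example above; the function names and the interval
representation are assumptions, and a real propagation
algorithm would track many kinds of relation, not only bounds
on a single variable.

```python
import math


def refine_fallthrough_lt(interval, bound):
    """Interval for x on the fall-through path of IF (x .LT. bound) GOTO ...

    The branch is not taken on this path, so x >= bound holds there.
    """
    lo, hi = interval
    return (max(lo, bound), hi)


def abs_is_identity(interval):
    """ABS(x) equals x for every x in the interval exactly when lo >= 0."""
    lo, _ = interval
    return lo >= 0


anything = (-math.inf, math.inf)              # nothing known about A on entry
after_test = refine_fallthrough_lt(anything, 0)

# B = A and B = ABS(A) agree wherever the invariant A >= 0 holds.
print(after_test, abs_is_identity(after_test))   # (0, inf) True
```

Without the conditional, nothing is known about A, the check
fails, and the mutant cannot be declared equivalent on these
grounds.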

8.3 Common Subexpressions. Perhaps the most common
optimization is to recognize calculations which are repeated
but which can be pre-computed. For example

     A = X+Y
     B = X+Y+Z

calculates X+Y twice, but can be replaced by a program which
uses a temporary variable to hold X+Y.

A common iterative algorithm for eliminating common
subexpressions uses global analysis to associate with each
statement the propagated variables, but this time
partitioned into equivalence classes under the equivalence
of evaluating to the same value. Since this method
generates equivalent expressions not used in the program,
the widest possible range of equivalent subexpressions is
recognized. This is a very useful technique for dealing
with mutations to assignment statements. Changing an
operator changes the equivalence class of the variable to
which the assignment was made. Similarly, mutations which
change an operand or destination in an assignment will
produce changes in the equivalence classes following the
assignment. Therefore, comparing the equivalence partitions
can demonstrate differences between the subject and the
mutation.

Consider  the  mutation 

A  =  B+C  (partition  =  A;B+C)  ==>  A=B-C  (partition  =  A;B-C) 

Comparing  the  partitions  shows  that  A  has  a  different  value 
in  the  two  programs. 

The  same  ideas  are  used  to  show  equivalence.  If  a 
mutation  has  changed  part  of  expression  E  to  an  expression 
E'  but  E  and  E'  are  in  the  same  equivalence  class,  then  the 
mutant  is  equivalent. 
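
The partition comparison can be sketched with a very small form
of value numbering. Everything here is an assumption made for
illustration: the four-field statement format, the name
value_classes, and the operand-swap mutant are ours, and Python
stands in for Fortran. Each variable is mapped to a canonical
key for the value it holds; variables with equal keys fall in
the same equivalence class.

```python
def value_classes(program):
    """Map each assigned variable to a canonical key for its value.

    `program` is a list of (target, op, left, right) assignments over
    variable names. Two variables with the same key are known to hold
    the same value.
    """
    key = {}

    def k(name):
        return key.get(name, ("input", name))   # unassigned names are inputs

    for target, op, left, right in program:
        a, b = k(left), k(right)
        if op in ("+", "*"):                    # commutative: order the operands
            a, b = sorted((a, b), key=repr)
        key[target] = (op, a, b)
    return key


subject = [("A", "+", "B", "C")]
mutant_op = [("A", "-", "B", "C")]      # operator replacement
mutant_swap = [("A", "+", "C", "B")]    # hypothetical operand swap

print(value_classes(subject)["A"] == value_classes(mutant_op)["A"])     # False
print(value_classes(subject)["A"] == value_classes(mutant_swap)["A"])   # True
```

Equal keys establish equivalence at that point. Unequal keys
show only that this analysis cannot place the two values in the
same class, so in practice such a check is a heuristic to be
combined with the other tests described in this section.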

8.4 Loop Invariants. Another common transformation
removes code from inside loops if the execution of that code
does not depend on the iteration of the loop. Since many
mutations change the boundaries of loops, techniques for
recognizing this invariance are useful for detecting
equivalent mutants. In those cases where the mutation
either increases or decreases the code within a loop, loop
invariant recognition can be used to decide whether or not
the effect of the loop is changed. In the following
mutation, excess code is brought within the scope of the DO
statement.

          DO 1 I=1,10                  DO 2 I=1,10
          A(I) = 0          ==>        A(I) = 0
     1    CONTINUE                1    CONTINUE
     2    B = 0                   2    B = 0

Since the assignment B=0 is loop invariant, it does not
matter how many times it is executed.
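
The effect of the DO-label mutation can be imitated directly.
The two functions below are our own Python transliteration of
the subject and the mutant, with the loop bound fixed at 10 as
in the example; they are meant only to show that the final
state is the same, not to reproduce Fortran DO semantics in
general.

```python
N = 10


def subject():
    a = [None] * (N + 1)        # index 0 unused, mirroring 1-based Fortran
    for i in range(1, N + 1):   # DO 1 I=1,10 ... 1 CONTINUE
        a[i] = 0
    b = 0                       # 2 B = 0, executed once after the loop
    return a, b


def mutant():
    a = [None] * (N + 1)
    b = None
    for i in range(1, N + 1):   # DO 2 I=1,10: the loop now ends at label 2
        a[i] = 0
        b = 0                   # executed on every iteration
    return a, b


assert subject() == mutant()
```

The mutant performs nine redundant assignments to B, so it is a
de-optimized but equivalent version of the subject.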

8.5  Hoisting  and  Sinking.  Hoisting  and  sinking  is  a 
form  of  code  removal  from  loops  in  which  code  which  will  be 
repeatedly  executed  is  moved  to  a  point  where  it  will  be 
executed  only  once;  this  is  accomplished  by  a  calculus  which 
gives  strict  conditions  on  when  a  block  of  code  can  be  moved 
up  (hoisted)  or  down  (sunk). 

The  applications  for  equivalence  testing  are  similar  to 
the  applications  for  loop  invariants.  The  major  difference 
is  that  hoisting  and  sinking  applies  to  cases  in  which  code 
is  included  or  excluded  along  an  execution  path  by  branching 
changes.  These  are  the  sorts  of  changes  obtained  by  GOTO 
replacement  and  statement  deletion  mutations.  In  these 
cases,  we  get  equivalence  if  the  added  or  deleted  code  can 
be  hoisted  or  sunk  out  of  the  block  involved  in  the  addition 
or  deletion. 

An example will illustrate.

          IF(A.EQ.0) GOTO 1                 IF(A.EQ.0) GOTO 2
          A = A+1                           A = A+1
     2    B = 0                ==>     2    B = 0
          GO TO 3                           GO TO 3
     1    B = 0                        1    B = 0
     3    .                            3    .

In this example B is set to 0 regardless of whether it
is assigned its value at line 1 or at line 2. The
assignment to B can be hoisted as follows:

          B = 0
          IF(A.EQ.0) GO TO 3
          A = A+1
     3    .

Since both programs are thus transformed, they are
equivalent.
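
As a sanity check on the hoisting argument, the subject, the
mutant and the hoisted form can be transliterated and compared
on sample inputs. This is our own Python rendering, with
structured conditionals standing in for the GOTOs; agreement on
a finite sample illustrates the equivalence but does not
replace the transformation argument given above.

```python
def subject(a):
    if a == 0:         # IF(A.EQ.0) GOTO 1
        b = 0          # label 1
    else:
        a = a + 1
        b = 0          # label 2, followed by GO TO 3
    return a, b        # label 3


def mutant(a):
    if a != 0:         # IF(A.EQ.0) GOTO 2 skips only the increment
        a = a + 1
    b = 0              # label 2 is reached on both paths
    return a, b


def hoisted(a):
    b = 0              # assignment moved above the test
    if a != 0:
        a = a + 1
    return a, b


for a in range(-3, 4):
    assert subject(a) == mutant(a) == hoisted(a)
```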

8.6  Dead  Code.  Dead  Code  detection  is  geared  toward 
identifying  sections  of  code  which  cannot  be  executed  or 
whose  execution  has  no  effect.  Dead  code  algorithms  exist 
for  detecting  several  varieties  of  dead  code  situations.  We 
have  already  used  dead  code  analysis  as  a  subproblem  in  the 
propagation  problems  above.  Dead  code  analysis  is  also 
useful  to  directly  test  equivalence,  particularly  for  those 
mutations  arising  from  an  alteration  of  control  flow. 

A typical application is to analyze the program
flow graphs. If, for example, a mutation disconnects the graph
and neither connected component consists entirely of dead
statements, then the mutant cannot be equivalent. Such
disconnection is possible by the mutant which inserts
RETURNs in Fortran subroutines.

Another common situation involves applying mutations to
sites in a program which are themselves dead code; this is
the classical compiler code optimization problem: we must
detect dead code since any mutations applied to it are
equivalent.

Dead  code  analysis  can  also  be  used  to  show 
nonequivalence  by  using  it  to  demonstrate  that  a  mutation 
has  "killed"  a  block  of  code. 
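
The flow-graph test amounts to a reachability computation. The
sketch below, in Python, uses a hypothetical four-statement
routine and a dictionary-of-successors representation that are
our own assumptions; it finds the statements that a RETURN
insertion has made unreachable.

```python
from collections import deque


def reachable(flow, entry):
    """Statements reachable from `entry` in a flow graph {node: [successors]}."""
    seen, queue = {entry}, deque([entry])
    while queue:
        node = queue.popleft()
        for nxt in flow.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical routine: s1 -> s2 -> s3 -> s4, where s4 is the exit.
subject = {"s1": ["s2"], "s2": ["s3"], "s3": ["s4"], "s4": []}
# Mutant: s2 replaced by RETURN, so control never reaches s3 or s4.
mutant = {"s1": ["s2"], "s2": [], "s3": ["s4"], "s4": []}

killed_block = reachable(subject, "s1") - reachable(mutant, "s1")
print(sorted(killed_block))   # ['s3', 's4']
```

If any statement in the killed block affects the output of the
subject, the mutant is not equivalent. Conversely, a mutation
applied at a site outside reachable(subject, entry) is applied
to dead code and is equivalent.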

8.7 Postprocessing the Mutants. Optimizing
transformations can be implemented as a postprocessor to a
mutation system. User experience is that it is relatively easy
to kill as many as 90% of the live mutants. To the remaining
10%, an equivalence heuristic such as the rules sketched
above can be applied. A more complete description of such a
postprocessor is available in [BaS].

The difficulty of judging equivalent mutants from those
remaining after the postprocessing stage both helps and
hinders the testing process. On one hand, forcing testers and
programmers  to  "sign  off"  on  equivalent  mutants  enforces  a 
unique  sort  of  accountability  in  the  testing  phase  of 
program  development  (see  Section  9).  On  the  other  hand, 
particularly  clever  programming  leads  to  many  equivalent 
mutants  whose  equivalence  is  rather  a  nuisance  to  judge; 
carelessness  for  these  programs  may  lead  to  error  proneness. 
Our  experience,  however,  is  that  production  programs  present 
no  special  difficulties  in  this  regard. 

9.  FURTHER  APPLICATIONS  OF  MUTATION 

9.1  Programming  Tool.  A  tester  specifies  to  an 
automatic  mutation  system: 

(1) a program,

(2) test data,

(3) a list of error operators to be applied.

The  system  generates  and  executes  the  required  mutants  on 
the  test  data,  "killing"  those  which  are  judged  incorrect 
vis  a  vis  the  execution  of  the  subject  program.  The  system 
also  produces  reports  which  the  user  may  examine  and  use  in 
subsequent  attempts  to  eliminate  mutants.  This  cycle  may  be 
viewed  as  a  series  of  interactive  sessions  in  which  the  user 
plays  the  role  of  an  advocate  who  defends  the  program  and 
the  system  plays  the  role  of  an  adversary  which  asks 
questions  of  the  form:  why  does  your  test  data  not 
distinguish  this  simple  error? 
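
The cycle just described can be sketched in a few lines. This
is a toy illustration in Python, not a description of how
FMS.1 or CMS.1 work: the two-argument maximum routine, the
table of relational operators and the names make_max and
mutation_run are all hypothetical.

```python
import operator

# Relational "error operators": each replaces the comparison in the program.
RELATIONAL = {".GT.": operator.gt, ".GE.": operator.ge,
              ".LT.": operator.lt, ".LE.": operator.le}


def make_max(compare):
    """Build a two-argument maximum routine around one comparison."""
    def program(x, y):
        return x if compare(x, y) else y
    return program


def mutation_run(original_op, operators, test_data):
    """Run every mutant on the test data; return the names of live mutants."""
    subject = make_max(operators[original_op])
    expected = [subject(*case) for case in test_data]
    live = []
    for name, compare in operators.items():
        if name == original_op:
            continue
        mutant = make_max(compare)
        if [mutant(*case) for case in test_data] == expected:
            live.append(name)            # no test case distinguishes it
    return live


print(mutation_run(".GT.", RELATIONAL, [(1, 1)]))
# ['.GE.', '.LT.', '.LE.']
print(mutation_run(".GT.", RELATIONAL, [(1, 1), (1, 2), (2, 1)]))
# ['.GE.']
```

The first report tells the user that the single test case is
too weak. After two more cases are added only the .GE. mutant
survives, and the user can justify it as equivalent: when the
arguments are equal it does not matter which one is returned.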

If the mutation system also provides the user a
pleasant runtime environment in which to write programs, the
advocate-adversary relationship can be used to add an
important dimension to the process of programming. Two of us
have argued for the importance of "social" filters in the
creative process [DLP]; mutation analysis applied during the
program design stage can be used to simulate an essential
social process. We have observed in our own programming
efforts and in the reported efforts of others, a tendency to
communicate programs to others -- obviously the act of
verbalizing ideas that had previously existed ethereally has a
way of exciting our intuitions (teachers have noted this
phenomenon also). The typical exchange involves the
programmer and a friendly but skeptical observer. As the
programmer explains his code, the observer (even if, as is
usually the case, he does not really understand the code)
asks the sort of questions that one expects from a minimally
attentive audience: "Why is that inequality not strict?",
"Is that the same variable used at line 30?" In response to
each such question, the programmer is forced to re-examine a
fixed line of code and meet the objection -- he either
justifies his decision to the observer, uncovers an error,
or must admit that he really does not understand why the
choice was made. In all three cases, the programmer
receives valuable feedback that he is unlikely to have
deduced by introspective analysis of the program.

The  adversary  role  of  a  mutation  system  always  forces 
the  user  into  a  careful  and  detailed  review  of  his  program 
and  the  design  decisions  made  in  constructing  it.  The 
mutations  are  like  the  minimally  attentive  observer  who  now 
and  then  chimes  in  with:  "I  don't  believe  that  --  justify 
it!"  Since  it  is  a  controlled  form  of  "pointing"  at  the 
code  which  requires  substantial  cooperation  from  the  user 
(his  justification  is  a  test  case)  such  interactive  use  does 
in  fact  simulate  an  important  aspect  of  the  social  process. 

9.2 Project Management. Of the emerging approaches to
software design, implementation and debugging -- however
helpful they may be to programmers and local managers --
there are few that can be utilized throughout the project
management hierarchy. Structured methods, program
verification and restricted modularization are essentially
qualitative, not quantitative, and managers should not be
expected to understand the qualitative basis for the
low-level decisions.

In addition to their primary function as evaluators of
test data, mutation systems record a great deal of
information which can be used to influence decision-making
throughout the project hierarchy. Using various
management-oriented repackagings of the information relating to mutant
failure percentages for each module (indicating how close
the software is to being acceptable), who has responsibility
for classifying which mutants as equivalent, and which
mutants have yet to fail, project management can:

(1) reassign personnel to work on modules with
low mutant failure rates,

(2) pinpoint responsibility for modules which fail
after acceptance,

(3) use audits to force justification of why
equivalent mutants exist,

(4) monitor adherence to project PERT charts, and

(5) offer rewards and incentives to programmers
who achieve high mutant failure rates.

Obviously, the information reported to managers varies
with the level of the manager, but a safe rule of thumb is
that the higher in the organization a request for
information originates, the less detailed is the expected response.

Project Manager's Report

The project manager periodically meets with the chief
programmers to evaluate the project's status. Also, the
assignment of personnel and evaluation of personnel
performance are carried out at this level. A useful report for
a manager at this level will contain:

(1)  the  name  of  each  module 

(2)  the  chief  programmer  responsible  for  each  module 

(3)  plots  of  mutant  eliminations  vs.  time  for  each 
major  submodule 

(4)  summary  statistics  such  as  number  and  percentage 
of  equivalent  mutants, 

(5)  the  number  and  type  of  personnel  assigned  to 
each  major  submodule 

Chief  Programmer's  Report 

A  chief  programmer  should  be  familiar  with  the  actual 
coding  of  each  submodule,  although  he  is  not  always  directly 
involved  in  the  coding  effort.  He  meets  daily  with  his 
team. The type of information needed by the chief
programmer would certainly encompass:

(1)-(5) for the project manager

(6) listings of equivalent mutants

(7) logs assigning responsibility for classifying
mutants as equivalent.

In addition to the goals outlined above, this
information has the effect of suggesting possible additional mutant
operators for a given submodule. Notice that the chief
programmer assumes the responsibility for asking a
programmer to justify mutant equivalence; assuming a postprocessor
such as the one described in Section 8, these equivalent
mutants should be largely non-trivial equivalences. A chief
programmer may want to know, for instance, why a certain
variable name can be changed without effect on the
submodule, that is, why the module is so insensitive to
the mutation.

In  the  last  analysis,  it  will  be  the  chief  programmer 
who  determines  that  a  given  submodule  has  been  acceptably 
tested  and  who  will  prepare  evidence  supporting  his  decision 
for  the  project  manager. 

Programmer's  and  Tester's  Report 

With  the  exception  of  the  personnel  reports,  the 
programmer  has  access  to  all  of  the  information  supplied  to 
the  levels  above  him.  He  also  has  access  to  all  listings, 
so  can  use  the  reporting  mechanism  to  augment  test  data, 
augment  mutant  operators,  classify  equivalent  mutants,  and 
determine  the  adequacy  of  the  test,  all  as  described  above. 

9.3 Acceptance and Certification. The degree to which
one has confidence in the competent programmer hypothesis
and the coupling effect for the given set of mutant
operators determines the confidence that the mutant
elimination percentage reflects the error-freeness of a program.
However, in the absence of strong information in this
regard, mutation analysis is an objective ranking device.
Low elimination percentages are less desirable than high
elimination percentages; furthermore, even though the
boundary may be rather fuzzy, it is rather easy to reject
obviously inadequate test data sets. This observation, coupled
with the fact that if all that is desired is an indication
of the strength of a previously produced test data set then
virtually no human interaction is required to produce the
analysis, leads one to consider the use of mutation analysis
for software procurement testing.

Since acceptance testing should be the final stage of
the development process, a buyer can specify at what point
the testing begins. Assuming that the developer is using
testing techniques with the sensitivity of mutation, the
buyer can monitor progress. To evaluate the delivered
software (or advertised software in the increasingly active
mail market for small system software), one may specify
contractually that the developer must present a convincing
case that he is not delivering "rigged" tests -- one way of
doing this is to specify a minimal mutant elimination
percentage. Many options ensue. Software not passing this
minimal certification may be rejected with significant
financial penalty to the developer. In this case it is not
essential that the developer use a mutation system to
develop the tests. It is important to note that no more
significance should be attached to the level of performance
required for acceptance than for, say, the third-party
testing of refrigerators by a well-known certifying
organization; the certification merely establishes the
likelihood that the developer has spent considerable effort
in testing his software. Thereafter, the buyer's confidence
will more likely be affected by nontechnical issues, such as
the developer's performance on similar projects.


10.  CONCLUDING  REMARKS 

A  program  passes  a  mutation  test  with  a  set  of  test 
data D if it behaves correctly on D and each mutant either
fails  to  work  as  specified  or  is  equivalent  to  the  program. 
When  a  program  passes  such  a  test,  we  are  sure  that  it  is 
free  from  simple  errors.  In  order  to  insure  that  such  a 
program  is  also  free  from  complex  errors,  one  must  appeal  to 
an  empirical  principle  called  the  coupling  effect  which 
states  that  such  a  set  of  test  data  is  so  sensitive  that 
non-equivalent  (complex)  mutants  are  also  likely  to  fail  on 
D.  The  conceptual  justification  for  the  coupling  effect 
parallels  the  probabilistic  arguments  used  to  justify  the 
single  fault  methods  used  to  test  logic  circuits  [Chang]. 
We  have  presented  a  combination  of  empirical  evidence  and 
plausibility  arguments  in  support  of  the  coupling  effect. 
This  leads  to  the  metatheorem  of  mutation  analysis: 

If  P  passes  mutation  analysis  then  either 
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(1)  P  is  correct,  or 

(2)  P  is  radically  incorrect. 


The  Competent  Programmer  Hypothesis  states 
perienced  programmers  tend  to  write  programs  that 
from  correct  ones  by  simple  errors  and  hence  possibi 
of  the  metatheorem  is  rather  unlikely. 

In order that the mutation analysis technique be
feasible, it is necessary that:

(1) the set of simple mutants be small,

(2) errors be reliably detected by the analysis, and

(3) the question of equivalence be reducible to
a small subproblem.
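Condition (1) is plausible because each simple mutant makes a single change at a single site, so the number of mutants of one type is the number of sites times the number of alternatives at each site. The sketch below illustrates this for relational operator replacement on a tokenized condition. The operator set and the function name are assumptions made for the illustration and are not the operator tables of the systems described here.

```python
# Hypothetical set of interchangeable relational operators.
REL_OPS = ["<", "<=", "=", "!=", ">=", ">"]


def relational_mutants(tokens):
    """Return every mutant obtained by replacing one relational operator.

    Each mutant differs from `tokens` in exactly one position, so the
    result has (number of operator sites) * (len(REL_OPS) - 1) entries.
    """
    mutants = []
    for position, token in enumerate(tokens):
        if token not in REL_OPS:
            continue
        for alternative in REL_OPS:
            if alternative != token:
                mutants.append(tokens[:position] + [alternative] + tokens[position + 1:])
    return mutants


condition = ["IF", "OLD-KEY", ">", "NEW-KEY"]
mutants_of_condition = relational_mutants(condition)
```

For the one-operator condition above this yields five mutants; a program with n such sites yields 5n, which grows linearly with program size rather than combinatorially.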

In the foregoing, we have presented our
knowledge with regard to these issues. Our experience has
been encouraging. Even if the goals of mutation analysis
are rather more optimistic than is warranted, the basis of a
modelling strategy is emerging; it appears that it is possible
to generate testable hypotheses concerning the
programming process. We can only hope that future research
by us and others will shed some light on this fascinating,
important, but little understood, activity.
APPENDIX


CMS.1 SESSION SCRIPT


WELCOME TO THE COBOL PILOT MUTATION SYSTEM

PLEASE ENTER THE NAME OF THE COBOL PROGRAM FILE: >LOG-CHANGES
DO  YOU  WANT  TO  PURGE  WORKING  FILES  FOR  A  FRESH  RUN  ?>YES 
PARSING  PROGRAM 
SAVING  INTERNAL  FORM 

WHAT  PERCENTAGE  OF  MUTANTS  DO  YOU  WANT  TO  CREATE?>100 
CREATING  MUTANT  DESCRIPTOR  RECORDS 
PRE-RUN  PHASE 

DO  YOU  WANT  TO  SUBMIT  A  TEST  CASE  ?  >PROGRAM 


  1  IDENTIFICATION DIVISION.
  2  PROGRAM-ID. POQAACA.
  3  AUTHOR. CPT R W MOREHEAD.
  4  INSTALLATION. HQS USACSC.
  5  DATE-WRITTEN. OCT 1973.
  6  REMARKS.
  7      THIS PROGRAM PRINTS OUT A LIST OF CHANGES IN THE ETF.
  8      ALL ETF CHANGES WERE PROCESSED PRIOR TO THIS PROGRAM. THE
  9      OLD ETF AND THE NEW ETF ARE THE INPUTS. BUT THERE IS NO
 10      FURTHER PROCESSING OF THE ETF HERE. THE ONLY OUTPUT IS A
 11      LISTING OF THE ADDS, CHANGES, AND DELETES. THIS PROGRAM IS
 12      FOR HQ USE ONLY AND HAS NO APPLICATION IN THE FIELD.
 13      ********************
 14      MODIFIED FOR TESTING UNDER CPMS BY ALLEN ACREE
 15      JULY, 1979.
 16  ENVIRONMENT DIVISION.
 17  CONFIGURATION SECTION.
 18  SOURCE-COMPUTER. PRIME.
 19  OBJECT-COMPUTER. PRIME.
 20  INPUT-OUTPUT SECTION.
 21  FILE-CONTROL.
 22      SELECT OLD-ETF ASSIGN INPUT4.
 23      SELECT NEW-ETF ASSIGN INPUTS.
 24      SELECT PRNTR ASSIGN TO OUTPUT9.
 25  DATA DIVISION.
 26  FILE SECTION.
 27  FD  OLD-ETF
 28      RECORD CONTAINS 80 CHARACTERS
 29      LABEL RECORDS ARE STANDARD
 30      DATA RECORD IS OLD-REC.
 31  01  OLD-REC.
 32      03  FILLER      PIC X.
 33      03  OLD-KEY     PIC X(12).
 34      03  FILLER      PIC X(67).
 35  FD  NEW-ETF
 36      RECORD CONTAINS 80 CHARACTERS
 37      LABEL RECORDS ARE STANDARD
 38      DATA RECORD IS NEW-REC.
 39  01  NEW-REC.
 40      03  FILLER      PIC X.
 41      03  NEW-KEY     PIC X(12).
 42      03  FILLER      PIC X(67).
 43  FD  PRNTR
 44      RECORD CONTAINS 40 CHARACTERS
 45      LABEL RECORDS ARE OMITTED
 46      DATA RECORD IS PRNT-LINE.
 47  01  PRNT-LINE       PIC X(40).
 48  WORKING-STORAGE SECTION.
 49  01  PRNT-WORK-AREA.
 50      03  LINE1       PIC X(30).
 51      03  LINE2       PIC X(30).
 52      03  LINE3       PIC X(20).
 53  01  PRNT-OUT-OLD.
 54      03  WS-LN-1.
 55          05  FILLER  PIC X VALUE SPACE.
 56          05  FILLER  PIC XXXX VALUE 'O   '.
 57          05  LN1     PIC X(30).
 58          05  FILLER  PIC XXX VALUE SPACES.
 59      03  WS-LN-2.
 60          05  FILLER  PIC X VALUE SPACE.
 61          05  FILLER  PIC XXXX VALUE 'L   '.
 62          05  LN2     PIC X(30).
 63          05  FILLER  PIC XXX VALUE SPACES.
 64      03  WS-LN-3.
 65          05  FILLER  PIC X VALUE SPACE.
 66          05  FILLER  PIC XXXX VALUE 'D   '.
 67          05  LN3     PIC X(20).
 68          05  FILLER  PIC XXX VALUE SPACE.
 69  01  PRNT-NEW-OUT.
 70      03  NEW-LN-1.
 71          05  FILLER  PIC XXXXX VALUE '  N  '.
 72          05  N-LN1   PIC X(30).
 73          05  FILLER  PIC XXX VALUE SPACE.
 74      03  NEW-LN-2.
 75          05  FILLER  PIC XXXXX VALUE '  E  '.
 76          05  N-LN2   PIC X(30).
 77          05  FILLER  PIC XXX VALUE SPACES.
 78      03  NEW-LN-3.
 79          05  FILLER  PIC XXXXX VALUE '  W  '.
 80          05  N-LN3   PIC X(20).
 81          05  FILLER  PIC XXX VALUE SPACES.
 82  PROCEDURE DIVISION.
 83  0100-OPENS.
 84      OPEN INPUT OLD-ETF NEW-ETF.
 85      OPEN OUTPUT PRNTR.
 86  0110-OLD-READ.
 87      READ OLD-ETF AT END GO TO 0160-OLD-EOF.
 88  0120-NEW-READ.
 89      READ NEW-ETF AT END GO TO 0170-NEW-EOF.
 90  0130-COMPARES.
 91      IF OLD-KEY = NEW-KEY
 92          NEXT SENTENCE
 93      ELSE GO TO 0140-CK-ADD-DEL.
 94      IF OLD-REC = NEW-REC
 95          GO TO 0110-OLD-READ.
 96      MOVE OLD-REC TO PRNT-WORK-AREA.
 97      PERFORM 0210-OLD-WRT THRU 0210-EXIT.
 98      MOVE NEW-REC TO PRNT-WORK-AREA.
 99      PERFORM 0200-NW-WRT THRU 0200-EXIT.
100      GO TO 0110-OLD-READ.
101  0140-CK-ADD-DEL.
102      IF OLD-KEY > NEW-KEY
103          MOVE NEW-REC TO PRNT-WORK-AREA
104          PERFORM 0200-NW-WRT THRU 0200-EXIT
105          GO TO 0120-NEW-READ
106      ELSE GO TO 0150-CK-ADD-DEL.
107  0150-CK-ADD-DEL.
108      MOVE OLD-REC TO PRNT-WORK-AREA.
109      PERFORM 0210-OLD-WRT THRU 0210-EXIT.
110      READ OLD-ETF AT END
111          MOVE NEW-REC TO PRNT-WORK-AREA
112          PERFORM 0200-NW-WRT THRU 0200-EXIT
113          GO TO 0160-OLD-EOF.
114      GO TO 0130-COMPARES.
115  0160-OLD-EOF.
116      READ NEW-ETF AT END GO TO 0180-EOJ.
117      MOVE NEW-REC TO PRNT-WORK-AREA.
118      PERFORM 0200-NW-WRT THRU 0200-EXIT.
119      GO TO 0160-OLD-EOF.
120  0170-NEW-EOF.
121      MOVE OLD-REC TO PRNT-WORK-AREA.
122      PERFORM 0210-OLD-WRT THRU 0210-EXIT.
123      READ OLD-ETF AT END GO TO 0180-EOJ.
124      GO TO 0170-NEW-EOF.
125  0180-EOJ.
126      CLOSE OLD-ETF NEW-ETF PRNTR.
127      STOP RUN.
128  0200-NW-WRT.
129      MOVE LINE1 TO N-LN1.
130      MOVE LINE2 TO N-LN2.
131      MOVE LINE3 TO N-LN3.
132      WRITE PRNT-LINE FROM NEW-LN-1 AFTER ADVANCING 2.
133      WRITE PRNT-LINE FROM NEW-LN-2 AFTER ADVANCING 1.
134      WRITE PRNT-LINE FROM NEW-LN-3 AFTER ADVANCING 1.
135  0200-EXIT.
136      EXIT.
137  0210-OLD-WRT.
138      MOVE LINE1 TO LN1.
139      MOVE LINE2 TO LN2.
140      MOVE LINE3 TO LN3.
141      WRITE PRNT-LINE FROM WS-LN-1 AFTER ADVANCING 2.
142      WRITE PRNT-LINE FROM WS-LN-2 AFTER ADVANCING 1.
143      WRITE PRNT-LINE FROM WS-LN-3 AFTER ADVANCING 1.
144  0210-EXIT.
145      EXIT.

>YES 

A test case for this program is a pair of input
files. In CMS.1 these may be created outside the
system and referenced by name, or may be entered "on
the fly".
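As a reading aid for the test cases that follow, the control flow of the PROCEDURE DIVISION listed above can be paraphrased as a small state machine over the two input files. The Python sketch below is a paraphrase and not part of CMS.1: the function and state names are hypothetical, the key is taken to be characters 2 through 13 of a record (OLD-KEY and NEW-KEY), printed blocks are represented as ("OLD", record) or ("NEW", record) pairs, and Python string ordering stands in for the collating sequence of the original machine.

```python
def key(record):
    """Characters 2-13 of a record, mirroring OLD-KEY / NEW-KEY (PIC X(12))."""
    return record[1:13]


def list_changes(old_records, new_records):
    """Paraphrase of paragraphs 0110 through 0180 of the listing."""
    out = []
    old_iter, new_iter = iter(old_records), iter(new_records)
    old = new = None
    state = "read_old"
    while state != "eoj":
        if state == "read_old":            # 0110-OLD-READ
            old = next(old_iter, None)
            state = "old_eof" if old is None else "read_new"
        elif state == "read_new":          # 0120-NEW-READ
            new = next(new_iter, None)
            state = "new_eof" if new is None else "compare"
        elif state == "compare":           # 0130-COMPARES, 0140, 0150
            if key(old) == key(new):
                if old != new:             # same key, changed contents
                    out.append(("OLD", old))
                    out.append(("NEW", new))
                state = "read_old"
            elif key(old) > key(new):      # new record was added
                out.append(("NEW", new))
                state = "read_new"
            else:                          # old record was deleted
                out.append(("OLD", old))
                old = next(old_iter, None)
                if old is None:            # lines 110-113
                    out.append(("NEW", new))
                    state = "old_eof"
        elif state == "old_eof":           # 0160-OLD-EOF
            new = next(new_iter, None)
            if new is None:
                state = "eoj"
            else:
                out.append(("NEW", new))
        elif state == "new_eof":           # 0170-NEW-EOF
            out.append(("OLD", old))
            old = next(old_iter, None)
            if old is None:
                state = "eoj"
    return out
```

Feeding the sketch one record whose key sorts below every key of a three-record new file produces one OLD block followed by three NEW blocks, which is the shape of the PRNTR output shown for that case later in the session.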

WHERE  IS  OLD-ETF? 
>LC9

WHERE  IS  NEW-ETF? 

>LC6 

OLD-ETF  AS  USED  BY  THE  PROGRAM 

1123456789012IIIIIIIIIIOJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFF0DDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE 

NEW-ETF  AS  USED  BY  THE  PROGRAM 

113345678901 200000000000000000000000000000000000000000000000000000000000000000 
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFFDDDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE
34567890 1234UUUUUUUUUUHHHHHHHHHHGGGGGGGGGGDDDDDDDDDDSSSSSSSSSSEEEEEEEEEEAAAAA

PRNTR  AS  USED  BY  THE  PROGRAM 

O  1123456789012IIIIIIIIIIOJJJJJJ
L  JJJKKKKKKKKKKLLLLLLLLLLNNNNNNN
D  NNNBBBBBBBBBBGGGGGGG

N  113345678901200000000000000000
E  000000000000000000000000000000 
W  00000000000000000000 

O  J234567890123YYYYYYYYYYGGGGGGG
L  GGGFFFFFFFFFFODDDDDDDDDSSSSSSS 
D  SSSXXXXXXXXXXEEEEEEE 

N  J234567890123YYYYYYYYYYGGGGGGG
E  GGGFFFFFFFFFFODDDDDDDDDSSSSSSS 
W  SSSXXXXXXXXXXEEEEEEE 

N  34567890 1234UUUUUUUUUUHHHHHHH
E  HHHGGGGGGGGGGDDDDDDDDDDSSSSSSS 
W  SSSEEEEEEEEEEAAAAAAA 

THE  PROGRAM  TOOK  84  STEPS 
IS  THIS  TEST  CASE  ACCEPTABLE  ?  >YES 
DO YOU WANT TO SUBMIT A TEST CASE ? >NO
MUTATION  PHASE 

WHAT  NEW  MUTANT  TYPES  ARE  TO  BE  CONSIDERED  ?  >SELECT 

ENTER  THE  NUMBERS  OF  THE  MUTANT  TYPES  YOU  WANT  TO  TURN  ON  AT  THIS  TIME. 

 4 **** INSERT FILLER TYPE ****
 5 **** FILLER SIZE ALTERATION TYPE ****
 6 **** ELEMENTARY ITEM REVERSAL TYPE ****
 7 **** FILE REFERENCE ALTERATION TYPE ****
 8 **** STATEMENT DELETION TYPE ****
10 **** PERFORM --> GO TO TYPE ****
11 **** THEN - ELSE REVERSAL TYPE ****
12 **** STOP STATEMENT SUBSTITUTION TYPE ****
13 **** THRU CLAUSE EXTENSION TYPE ****
14 **** TRAP STATEMENT REPLACEMENT TYPE ****
20 **** LOGICAL OPERATOR REPLACEMENT TYPE ****
21 **** SCALAR FOR SCALAR REPLACEMENT ****
22 **** CONSTANT FOR CONSTANT REPLACEMENT ****
23 **** CONSTANT FOR SCALAR REPLACEMENT ****
25 **** CONSTANT ADJUSTMENT ****

TYPES ? >4 TO 14 STOP

MUTANT STATUS
TYPE      TOTAL   LIVE     PCT
INSERT       41      7   82.93
FILLSZ       38     14   63.16
ITEMRV       21      0  100.00
FILES         5      1   80.00
DELETE       54     13   75.93
PER GO        7      2   71.43
IF REV        3      1   66.67
STOP         53     10   81.13
THRU          8      2   75.00
TRAP         54     10   81.48

TOTALS      284     60   78.87

DO YOU WANT TO SEE THE LIVE MUTANTS?>NO
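The PCT column of the status table above is consistent with the percentage of mutants eliminated, that is, 100 (TOTAL - LIVE) / TOTAL. The short Python check below recomputes the column from the TOTAL and LIVE figures printed in the session; the variable and function names are ours, and the reading of PCT as "percent eliminated" is an inference from the figures rather than a statement in the script.

```python
# (TOTAL, LIVE) pairs as printed in the first MUTANT STATUS table.
STATUS = {
    "INSERT": (41, 7),
    "FILLSZ": (38, 14),
    "ITEMRV": (21, 0),
    "FILES": (5, 1),
    "DELETE": (54, 13),
    "PER GO": (7, 2),
    "IF REV": (3, 1),
    "STOP": (53, 10),
    "THRU": (8, 2),
    "TRAP": (54, 10),
}


def percent_eliminated(total, live):
    """Percentage of mutants no longer live, rounded as in the table."""
    return round(100.0 * (total - live) / total, 2)


total_mutants = sum(total for total, _ in STATUS.values())
live_mutants = sum(live for _, live in STATUS.values())
overall = percent_eliminated(total_mutants, live_mutants)
per_type = {name: percent_eliminated(t, l) for name, (t, l) in STATUS.items()}
```

The sums reproduce the TOTALS row (284 mutants, 60 live), and the recomputed percentages agree with every printed PCT entry.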

LOOP  OR  HALT  ?  >LOOP 
PRE-RUN  PHASE 

DO  YOU  WANT  TO  SUBMIT  A  TEST  CASE  ?  >YES 
WHERE  IS  OLD-ETF? 

>LC15

WHERE IS NEW-ETF?

>LC5

OLD-ETF AS USED BY THE PROGRAM

0000000000012IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
1123456789012IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFFDDDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE

NEW-ETF AS USED BY THE PROGRAM

1123456789012IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFFDDDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE

PRNTR AS USED BY THE PROGRAM

O  0000000000012IIIIIIIIIIJJJJJJJ
L  JJJKKKKKKKKKKLLLLLLLLLLNNNNNNN
D  NNNBBBBBBBBBBGGGGGGG

THE  PROGRAM  TOOK  44  STEPS 
IS  THIS  TEST  CASE  ACCEPTABLE  ?  >YES 
DO  YOU  WANT  TO  SUBMIT  A  TEST  CASE  ?  >YES 
WHERE  IS  OLD-ETF? 

>LC14

WHERE IS NEW-ETF?

>LC5

OLD-ETF AS USED BY THE PROGRAM
1123456789012IIIIIIIIIIKJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFFDDDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE

NEW-ETF AS USED BY THE PROGRAM

1123456789012IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFFDDDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE

PRNTR AS USED BY THE PROGRAM

O  1123456789012IIIIIIIIIIKJJJJJJ
L  JJJKKKKKKKKKKLLLLLLLLLLNNNNNNN
D  NNNBBBBBBBBBBGGGGGGG

N  1123456789012IIIIIIIIIIJJJJJJJ
E  JJJKKKKKKKKKKLLLLLLLLLLNNNNNNN
W  NNNBBBBBBBBBBGGGGGGG

THE  PROGRAM  TOOK  48  STEPS 
IS  THIS  TEST  CASE  ACCEPTABLE  ?  >YES 
DO  YOU  WANT  TO  SUBMIT  A  TEST  CASE  ?  >YES 
WHERE  IS  OLD-ETF? 

>LC11

WHERE IS NEW-ETF?

>LC1

OLD-ETF AS USED BY THE PROGRAM
00000000000000000000000000000000000000000000
NEW-ETF AS USED BY THE PROGRAM

1123456789012IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFFDDDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE
34567890 1234UUUUUUUUUUHHHHHHHHHHGGGGGGGGGGDDDDDDDDDDSSSSSSSSSSEEEEEEEEEEAAAAA

PRNTR AS USED BY THE PROGRAM

O  000000000000000000000000000000
L  00000000000000
D

N  1123456789012IIIIIIIIIIJJJJJJJ
E  JJJKKKKKKKKKKLLLLLLLLLLNNNNNNN
W  NNNBBBBBBBBBBGGGGGGG

N  J234567890123YYYYYYYYYYGGGGGGG
E  GGGFFFFFFFFFFDDDDDDDDDDSSSSSSS
W  SSSXXXXXXXXXXEEEEEEE

N  34567890 1234UUUUUUUUUUHHHHHHH
E  HHHGGGGGGGGGGDDDDDDDDDDSSSSSSS
W  SSSEEEEEEEEEEAAAAAAA

THE  PROGRAM  TOOK  64  STEPS 
IS  THIS  TEST  CASE  ACCEPTABLE  ?  >YES 
DO YOU WANT TO SUBMIT A TEST CASE ? >YES
WHERE IS OLD-ETF?

>LC1

WHERE IS NEW-ETF?

>LC11

OLD-ETF AS USED BY THE PROGRAM

1123456789012IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLNNNNNNNNNNBBBBBBBBBBGGGGG
J234567890123YYYYYYYYYYGGGGGGGGGGFFFFFFFFFFDDDDDDDDDDSSSSSSSSSSXXXXXXXXXXEEEEE
34567890 1234UUUUUUUUUUHHHHHHHHHHGGGGGGGGGGDDDDDDDDDDSSSSSSSSSSEEEEEEEEEEAAAAA

NEW-ETF AS USED BY THE PROGRAM

00000000000000000000000000000000000000000000

PRNTR AS USED BY THE PROGRAM

N  000000000000000000000000000000
E  00000000000000
W

O  1123456789012IIIIIIIIIIJJJJJJJ
L  JJJKKKKKKKKKKLLLLLLLLLLNNNNNNN
D  NNNBBBBBBBBBBGGGGGGG

O  J234567890123YYYYYYYYYYGGGGGGG
L  GGGFFFFFFFFFFDDDDDDDDDDSSSSSSS
D  SSSXXXXXXXXXXEEEEEEE

O  34567890 1234UUUUUUUUUUHHHHHHH
L  HHHGGGGGGGGGGDDDDDDDDDDSSSSSSS
D  SSSEEEEEEEEEEAAAAAAA

THE PROGRAM TOOK 64 STEPS
IS THIS TEST CASE ACCEPTABLE ? >YES
DO YOU WANT TO SUBMIT A TEST CASE ? >NO
MUTATION PHASE

WHAT NEW MUTANT TYPES ARE TO BE CONSIDERED ?

MUTANT STATUS
TYPE      TOTAL   LIVE     PCT
INSERT       41      3   92.68
FILLSZ       38     12   68.42
ITEMRV       21      0  100.00
FILES         5      0  100.00
DELETE       54      1   98.15
PER GO        7      0  100.00
IF REV        3      0  100.00
STOP         53      0  100.00
THRU          8      0  100.00
TRAP         54      0  100.00
LOGIC        15      1   93.33
SUBSFS      814     20   97.54
SUBCFC       10      0  100.00
SUBCFS       58      0  100.00
C ADJ        10      0  100.00

TOTALS     1195     37   96.90

DO  YOU  WANT  TO  SEE  THE  LIVE  MUTANTS? >YES 
FOR  EACH  MUTANT  : 

HIT  RETURN  TO  CONTINUE.  TYPE  'STOP'  TO  STOP. 
TYPE 'EQUIV' TO JUDGE THE MUTANT EQUIVALENT.

**** INSERT FILLER TYPE ****

MUTANT NUMBER 12

A FILLER OF LENGTH ONE HAS BEEN INSERTED AFTER
THE ITEM WHICH STARTS ON LINE 52
ITS LEVEL NUMBER IS 3


MUTANT NUMBER 13

A FILLER OF LENGTH ONE HAS BEEN INSERTED AFTER
THE ITEM WHICH STARTS ON LINE 53
ITS LEVEL NUMBER IS 3


MUTANT NUMBER 20

A FILLER OF LENGTH ONE HAS BEEN INSERTED AFTER
THE ITEM WHICH STARTS ON LINE 69
ITS LEVEL NUMBER IS 1


**** FILLER SIZE ALTERATION TYPE ****

MUTANT NUMBER 54
THE FILLER ON LINE 58 HAS HAD ITS SIZE DECREMENTED BY ONE

MUTANT NUMBER 55
THE FILLER ON LINE 58 HAS HAD ITS SIZE INCREMENTED BY ONE

MUTANT NUMBER 60
THE FILLER ON LINE 63 HAS HAD ITS SIZE DECREMENTED BY ONE

MUTANT NUMBER 61
THE FILLER ON LINE 63 HAS HAD ITS SIZE INCREMENTED BY ONE

MUTANT NUMBER 66
THE FILLER ON LINE 68 HAS HAD ITS SIZE DECREMENTED BY ONE

MUTANT NUMBER 67
THE FILLER ON LINE 68 HAS HAD ITS SIZE INCREMENTED BY ONE

MUTANT NUMBER 70
THE FILLER ON LINE 73 HAS HAD ITS SIZE DECREMENTED BY ONE

MUTANT NUMBER 71
THE FILLER ON LINE 73 HAS HAD ITS SIZE INCREMENTED BY ONE

MUTANT NUMBER 74
THE FILLER ON LINE 77 HAS HAD ITS SIZE DECREMENTED BY ONE

MUTANT NUMBER 75
THE FILLER ON LINE 77 HAS HAD ITS SIZE INCREMENTED BY ONE

MUTANT NUMBER 78
THE FILLER ON LINE 81 HAS HAD ITS SIZE DECREMENTED BY ONE

MUTANT NUMBER 79
THE FILLER ON LINE 81 HAS HAD ITS SIZE INCREMENTED BY ONE

> 


**** STATEMENT DELETION TYPE ****

MUTANT NUMBER 126

ON LINE 106 THE STATEMENT:

GO TO 0150-CK-ADD-DEL
HAS BEEN DELETED.


> 


****  LOGICAL  OPERATOR  REPLACEMENT  TYPE  **** 

MUTANT  NUMBER  296 

ON  LINE  102  THE  STATEMENT: 

IF  OLD-KEY  >  NEW-KEY 
HAS  BEEN  CHANGED  TO: 

IF  OLD-KEY  NOT  <  NEW-KEY 


> 
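Mutant 296 appears to be a case where the EQUIV judgement applies. In the listing, line 102 is reached only through the ELSE branch on line 93, so OLD-KEY and NEW-KEY are known to differ there, and on unequal keys ">" and "NOT <" give the same answer. The Python sketch below checks this exhaustively over a small set of sample keys. It is a sanity check under the stated reading of the control flow, not a proof and not a CMS.1 feature; the sample keys and function names are ours.

```python
from itertools import product


def original(old_key, new_key):
    """Condition on line 102 of the listing: OLD-KEY > NEW-KEY."""
    return old_key > new_key


def mutant_296(old_key, new_key):
    """Mutated condition: OLD-KEY NOT < NEW-KEY."""
    return not (old_key < new_key)


SAMPLE_KEYS = ["A", "B", "C"]
all_pairs = list(product(SAMPLE_KEYS, repeat=2))

# States that can reach line 102: the keys were found unequal on line 91.
reachable_pairs = [(o, n) for o, n in all_pairs if o != n]

differs_when_reachable = [p for p in reachable_pairs if original(*p) != mutant_296(*p)]
differs_anywhere = [p for p in all_pairs if original(*p) != mutant_296(*p)]
```

The two conditions disagree only on equal keys, which cannot occur at line 102, so no test data can distinguish the mutant from the program.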


****  SCALAR  FOR  SCALAR  REPLACEMENT  **** 

MUTANT  NUMBER  300 

ON  LINE  87  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  NEW-REC  AT  END  ... 


MUTANT NUMBER 301

ON LINE 87 THE STATEMENT:

READ OLD-ETF AT END ...
HAS BEEN CHANGED TO:

READ OLD-ETF INTO PRNT-WORK-AREA AT END ...


> 

MUTANT  NUMBER  .III 

ON LINE 89 THE STATEMENT:

READ  NEW-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  NEW-ETF  INTO  PRNT-WORK-AREA  AT  END 


MUTANT  NUMBER  629 

ON  LINE  110  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  PRNT-WORK-AREA  AT  END 


MUTANT  NUMBER  682 

ON  LINE  116  THE  STATEMENT: 

READ  NEW-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  NEW-ETF  INTO  OLD-REC  AT  END  ... 


MUTANT  NUMBER  683 

ON  LINE  116  THE  STATEMENT: 

READ  NEW-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  NEW-ETF  INTO  PRNT-WORK-AREA  AT  END 


MUTANT  NUMBER  684 

ON  LINE  116  THE  STATEMENT: 

READ  NEW-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  NEW-ETF  INTO  PRNT-OUT-OLD  AT  END  . . 


> 

MUTANT NUMBER 685

ON  LINE  116  THE  STATEMENT: 

READ  NEW-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  NEW-ETF  INTO  WS-LN-1  AT  END  ... 


MUTANT  NUMBER  686 

ON  LINE  116  THE  STATEMENT: 

READ  NEW-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  NEW-ETF  INTO  WS-LN-2  AT  END  ... 


MUTANT  NUMBER  687 

ON  LINE  116  THE  STATEMENT: 

READ  NEW-ETF  AT  END  ... 
HAS BEEN CHANGED TO:

READ NEW-ETF INTO WS-LN-3 AT END ...


MUTANT  NUMBER  780 

ON  LINE  123  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  NEW-REC  AT  END  ... 


MUTANT  NUMBER  781 

ON  LINE  123  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  PRNT-WORK-AREA  AT  END 


MUTANT  NUMBER  786 

ON  LINE  123  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  PRNT-NEW-OUT  AT  END  .. 


MUTANT  NUMBER  787 

ON  LINE  123  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  NEW-LN-1  AT  END  ... 


MUTANT  NUMBER  788 

ON  LINE  123  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  NEW-LN-2  AT  END  ... 


MUTANT  NUMBER  789 

ON  LINE  123  THE  STATEMENT: 

READ  OLD-ETF  AT  END  ... 

HAS  BEEN  CHANGED  TO: 

READ  OLD-ETF  INTO  NEW-LN-3  AT  END  ... 

> 

MUTANT  NUMBER  819 

ON  LINE  129  THE  STATEMENT: 

MOVE LINE1 TO N-LN1
HAS BEEN CHANGED TO:

MOVE NEW-REC TO N-LN1

>

MUTANT NUMBER 817

ON LINE 129 THE STATEMENT:

MOVE LINE1 TO N-LN1
HAS BEEN CHANGED TO:

MOVE PRNT-WORK-AREA TO N-LN1


> 

MUTANT  NUMBER  974 

ON LINE 138 THE STATEMENT:

MOVE LINE1 TO LN1
HAS BEEN CHANGED TO:

MOVE OLD-REC TO LN1


>

MUTANT NUMBER 979

ON LINE 138 THE STATEMENT:

MOVE LINE1 TO LN1
HAS BEEN CHANGED TO:

MOVE PRNT-WORK-AREA TO LN1


> 

LOOP  OR  HALT  ?  >HALT 


